MLPavementDistressDataAnalysis

This is a class project that applies machine learning to pavement distress data downloaded from LTPP InfoPave. Construction, annual humidity, temperature, and traffic data from LTPP InfoPave were used to predict the cracking percentage of concrete pavement. Linear regression, decision tree, and random forest models were trained and compared with each other. The random forest had the advantage in accuracy, and the decision tree had the advantage in speed. Neither of the two reached its best performance because effective data were lacking.
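The comparison described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's actual code: the feature matrix, the target formula, the train/test split, and the choice of R^2 and fit time as the comparison measures are all assumptions made for the example.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-ins for the climate, traffic and thickness features (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + rng.normal(scale=0.3, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model, timing the fit and scoring it on the held-out rows.
results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start
    r2 = r2_score(y_test, model.predict(X_test))
    results[name] = {"r2": r2, "fit_seconds": fit_seconds}

for name, res in results.items():
    print(f"{name}: R^2 = {res['r2']:.3f}, fit time = {res['fit_seconds']:.4f} s")
```

The numbers printed for synthetic data say nothing about the LTPP results; the sketch only shows one way to put the three models side by side on accuracy and speed.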

Acknowledgments

The project team would like to thank the U.S. Department of Transportation Federal Highway Administration and LTPP InfoPave for the pavement distress data, traffic data, and climate data. The project team would also like to thank Pengyu Xie for his instruction on feature selection for pavement distress data.

Data Sources

Website-based Database

https://infopave.fhwa.dot.gov/Data/DataSelection

Data Resolution

One-year intervals.

Code Environment

IDE: Jupyter Notebook (Python). Modules: pandas, NumPy, scikit-learn, Matplotlib, tqdm

Feature Selection

SHRP_ID, STATE_CODE, YEAR, SURVEY_DATE, HPMS16_CRACKING_PERCENT_JPCC, MEAN_ANN_TEMP_AVG, FREEZE_INDEX_YR, FREEZE_THAW_YR, MAX_ANN_HUM_AVG, MIN_ANN_HUM_AVG, CONSTRUCTION_NO, AADTT_ALL_TRUCKS_TREND, ANNUAL_TRUCK_VOLUME_TREND and REPR_THICKNESS.

Merging Logic

First, the humidity and temperature data are merged on 'SHRP_ID', 'STATE_CODE', 'YEAR', and 'VWS_ID'. Second, the result is merged with the cracking data and then with the traffic data, each time on 'SHRP_ID', 'STATE_CODE', and 'YEAR'. Finally, the result is merged with the construction data on 'SHRP_ID', 'STATE_CODE', and 'CONSTRUCTION_NO'. Because the target is the JPCC cracking percentage, only the concrete construction data were kept.
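The merge sequence above might look roughly like this in pandas. It is a sketch under stated assumptions, not the notebook's code: the rows are made up, the cracking table is assumed to already carry 'YEAR' and 'CONSTRUCTION_NO', the `LAYER_TYPE` column and its "PC" value are hypothetical names for the concrete filter, and inner joins (the pandas default) are assumed.

```python
import pandas as pd

# Made-up rows standing in for the LTPP InfoPave exports; all values are illustrative.
ids = {
    "SHRP_ID": ["0101", "0101", "0202"],
    "STATE_CODE": [1, 1, 2],
    "YEAR": [2000, 2001, 2000],
}

humidity = pd.DataFrame({**ids, "VWS_ID": [10, 10, 20],
                         "MAX_ANN_HUM_AVG": [85.0, 83.0, 70.0]})
temperature = pd.DataFrame({**ids, "VWS_ID": [10, 10, 20],
                            "MEAN_ANN_TEMP_AVG": [15.2, 15.9, 9.4]})
# Assumption: the cracking table already carries YEAR and CONSTRUCTION_NO.
cracking = pd.DataFrame({**ids, "CONSTRUCTION_NO": [1, 1, 1],
                         "HPMS16_CRACKING_PERCENT_JPCC": [2.0, 3.5, 0.0]})
traffic = pd.DataFrame({**ids, "AADTT_ALL_TRUCKS_TREND": [900, 950, 300]})
# Assumption: LAYER_TYPE and its "PC" (concrete) value are hypothetical names.
construction = pd.DataFrame({
    "SHRP_ID": ["0101", "0202"],
    "STATE_CODE": [1, 2],
    "CONSTRUCTION_NO": [1, 1],
    "LAYER_TYPE": ["PC", "AC"],
    "REPR_THICKNESS": [10.0, 6.0],
})

climate_keys = ["SHRP_ID", "STATE_CODE", "YEAR", "VWS_ID"]
yearly_keys = ["SHRP_ID", "STATE_CODE", "YEAR"]
construction_keys = ["SHRP_ID", "STATE_CODE", "CONSTRUCTION_NO"]

# Step 1: humidity with temperature.
merged = humidity.merge(temperature, on=climate_keys)
# Step 2: cracking, then traffic, one by one.
merged = merged.merge(cracking, on=yearly_keys)
merged = merged.merge(traffic, on=yearly_keys)
# Step 3: keep only concrete construction rows, then merge them in.
concrete = construction[construction["LAYER_TYPE"] == "PC"]
merged = merged.merge(concrete, on=construction_keys)

print(merged)
```

With inner joins, the rows of the non-concrete section drop out at the last step because no matching construction row remains.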

'SHRP_ID' is the test section identification number assigned by the LTPP program. It must be combined with 'STATE_CODE' to be unique. 'STATE_CODE' is the numerical code for a state or province. U.S. codes are consistent with the Federal Information Processing Standards.

'VWS_ID' is the code that uniquely identifies a virtual weather station.

'YEAR' is the year to which a record applies. The cracking data provide 'SURVEY_DATE' rather than 'YEAR', so the survey date has to be converted to a year before the cracking data can be matched with the other data.
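One way to derive 'YEAR' from 'SURVEY_DATE' is sketched below. The rows and the date format are assumptions for illustration; the actual export may store dates differently.

```python
import pandas as pd

# Made-up cracking rows; the ISO date strings are an assumed format.
cracking = pd.DataFrame({
    "SHRP_ID": ["0101", "0101"],
    "STATE_CODE": [1, 1],
    "SURVEY_DATE": ["2000-06-14", "2001-09-03"],
    "HPMS16_CRACKING_PERCENT_JPCC": [2.0, 3.5],
})

# Parse the survey date and keep only its calendar year as the merge key.
cracking["YEAR"] = pd.to_datetime(cracking["SURVEY_DATE"]).dt.year

print(cracking[["SHRP_ID", "STATE_CODE", "YEAR"]])
```

If a section was surveyed more than once in the same year, this yields several rows per key, and those would need to be reduced to one before merging.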

'CONSTRUCTION_NO' is the event number used to relate changes in pavement structure to other time-dependent data elements. This field is set to 1 when a test section is initially accepted into LTPP and is incremented with each change to the layer structure.

The principle for data merging is to find the columns that uniquely identify each record and use them as merge keys. The keys chosen in this project are time and location.
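Whether a set of columns really identifies each record can be checked before merging. The helper below, `duplicated_keys`, is a hypothetical name introduced for this sketch, and the rows are made up; it shows why 'SHRP_ID' alone is not enough and has to be combined with 'STATE_CODE'.

```python
import pandas as pd


def duplicated_keys(df, keys):
    """Return the rows whose key combination appears more than once."""
    return df[df.duplicated(subset=keys, keep=False)]


# Made-up traffic rows: the same SHRP_ID occurs in two different states.
traffic = pd.DataFrame({
    "SHRP_ID": ["0101", "0101", "0102"],
    "STATE_CODE": [1, 2, 1],
    "YEAR": [2000, 2000, 2000],
    "AADTT_ALL_TRUCKS_TREND": [900, 400, 650],
})

# Without STATE_CODE the key is ambiguous; with it, every row is unique.
without_state = duplicated_keys(traffic, ["SHRP_ID", "YEAR"])
with_state = duplicated_keys(traffic, ["SHRP_ID", "STATE_CODE", "YEAR"])

print(len(without_state), len(with_state))
```

`DataFrame.merge` also accepts a `validate` argument (for example `validate="one_to_one"`) that raises an error when the keys are not unique, which can serve the same purpose at merge time.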
