Automatic identification of variables in epidemiological datasets using logic regression
2017
Background
For an individual participant data (IPD) meta-analysis, multiple datasets must be transformed in a consistent format, e.g. using uniform variable names. When large numbers of datasets have to be processed, this can be a time-consuming and error-prone task. Automated or semi-automated identification of variables can help to reduce the workload and improve the data quality. For semi-automation high sensitivity in the recognition of matching variables is particularly important, because it allows creating software which for a target variable presents a choice of source variables, from which a user can choose the matching one, with only low risk of having missed a correct source variable.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
18
References
2
Citations
NaN
KQI