Ensuring the appropriate organization of research data is essential to prevent errors and clutter in project files.
The structure of the files should be clear to the author, the entire research team, and anyone who might access the data.
Use the clearest possible folder structure, particularly during a group project and before sharing a dataset.
Additionally, please remember:
Please avoid the following practices:
File names may contain substantial information about their content. They must be consistent, logical, descriptive, concise, and clear. Establishing a naming convention agreed upon by all project members is essential to prevent unexpected errors. Description elements should be ordered from general to specific.
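A naming convention is easier to keep consistent when it is written down as code. The following is a minimal sketch, not a prescribed standard: the element order (project, description, date, version), the separators, and the function names `build_name` and `follows_convention` are all assumptions chosen for illustration, with elements ordered from general to specific.

```python
import re
from datetime import date

# Hypothetical convention (an assumption, not taken from the guide):
#   project_description_YYYYMMDD_vNN.ext
# Elements run from general (project) to specific (version).
NAME_PATTERN = re.compile(
    r"[a-z0-9]+(?:-[a-z0-9]+)*"    # project, lowercase words joined by hyphens
    r"_[a-z0-9]+(?:-[a-z0-9]+)*"   # short description of the content
    r"_\d{8}"                      # date as YYYYMMDD
    r"_v\d{2,}"                    # zero-padded version number
    r"\.[a-z0-9]+"                 # file extension
)


def _slug(text: str) -> str:
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_name(project: str, description: str, day: date, version: int, ext: str) -> str:
    """Assemble a file name from its elements, general to specific."""
    extension = ext.lower().lstrip(".")
    return f"{_slug(project)}_{_slug(description)}_{day:%Y%m%d}_v{version:02d}.{extension}"


def follows_convention(name: str) -> bool:
    """Return True if the whole file name matches the agreed pattern."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, `build_name("Soil Survey", "pH readings", date(2024, 3, 5), 2, "csv")` yields `soil-survey_ph-readings_20240305_v02.csv`, while a name such as `final version (2).csv` fails `follows_convention`. A team would replace the pattern with whatever convention its members actually agree on.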
A file name may include the following elements:
Additional Recommendations:
| Type of Data | Recommended Formats |
|---|---|
| Text Files | .txt (plain text); .pdf (Portable Document Format); .tex (LaTeX documents); .html (Hypertext Markup Language); .odt (OpenDocument Text); .xml (Extensible Markup Language) |
| Tables, spreadsheets, and databases | .txt/.tsv/.tab (tab-separated tables); .csv/.txt (comma-separated tables); other standard delimiters (e.g. colon, pipe); fixed-width; .ods (OpenDocument Spreadsheet); .odb (OpenDocument Database) |
| Image Files | .tiff/.tif (TIFF); .jpg (JPEG); .jp2 (JPEG 2000); .png (Portable Network Graphics); .svg (Scalable Vector Graphics); .pdf (Portable Document Format); .gif (Graphics Interchange Format); .bmp (Microsoft Windows Bitmap Format) |
| Sound Files | .wav (WAVE); .flac (FLAC); .mp3 (MPEG-1/2 Audio Layer III, usually suitable for human voice and moderate-quality audio, but possibly not for high-fidelity audio); .aiff (Audio Interchange File Format) |
| Video Files | .mp4 (MPEG-4); .mxf (Material Exchange Format) |
| Databases | .xml (Extensible Markup Language); .csv (comma-separated tables) |
| Geospatial Data | .tiff (geo-referenced TIFF); .shp, .shx, .dbf (ESRI Shapefile); .kml (Keyhole Markup Language); .nc (Network Common Data Form) |
| Web Data | .json (JavaScript Object Notation); .xml (Extensible Markup Language); .html (Hypertext Markup Language) |
| Web Archive | .warc (WebARChive) |
| Multidimensional Arrays | .cdf (Common Data Format); .nc (Network Common Data Form); .hdf/.h5 (Hierarchical Data Format) |
| E-books | .epub (Electronic Publication) |
Source: File Formats - Research Data Management - Best Practices - Research Guides at Ohio State University
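
The recommended formats can also be checked mechanically before a dataset is shared. The sketch below is one possible approach under stated assumptions: the dictionary keys (`"text"`, `"tabular"`, and so on) and the function name `unrecommended` are hypothetical, and only some rows of the table are transcribed.

```python
from pathlib import Path

# Extensions transcribed from part of the table above.
# The grouping keys are illustrative names, not part of the guide.
RECOMMENDED = {
    "text": {".txt", ".pdf", ".tex", ".html", ".odt", ".xml"},
    "tabular": {".txt", ".tsv", ".tab", ".csv", ".ods", ".odb"},
    "image": {".tiff", ".tif", ".jpg", ".jp2", ".png", ".svg", ".pdf", ".gif", ".bmp"},
    "sound": {".wav", ".flac", ".mp3", ".aiff"},
    "video": {".mp4", ".mxf"},
}


def unrecommended(paths, data_type):
    """Return the paths whose extension is not listed for the given data type."""
    allowed = RECOMMENDED[data_type]
    # Compare case-insensitively so that "DATA.CSV" is treated like "data.csv".
    return [p for p in paths if Path(p).suffix.lower() not in allowed]
```

Calling `unrecommended(["a.csv", "b.xlsx", "C.TSV"], "tabular")` returns `["b.xlsx"]`, flagging the one file that would need converting. The check looks only at the extension, so it cannot confirm that a file's content is valid for its format.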