This program helps researchers submit data to the European Variant Archive (EVA) by assisting with the metadata the archive requires.
To report a problem, please open an issue in this repository and provide enough information to reproduce the error. Useful details include:
- Full error message
- Files associated with error message
- OS information (Linux/Mac OS/Windows)
To run the program, you will need:
- All your variant files (vcf, gff, bed, etc.) organized as instructed below
- Python
- The Python module openpyxl
  - To install openpyxl, run:
    pip install openpyxl
  - To check whether openpyxl is available, start Python and import it:
    python
    >>> from openpyxl import Workbook
  - Once openpyxl is installed, this import should run without errors.
  - Visit the openpyxl website for more information and help.
- (Optional) user_info.config file
- (Optional) project_info.config file
- (Optional) analysis_info.config file
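The openpyxl check described in the list above can also be scripted. The following is a minimal sketch and not part of mkmetadata; the helper name `module_available` is hypothetical. It uses the standard-library function `importlib.util.find_spec` to test whether a module can be found without actually importing it.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Report whether the one third-party requirement is present.
    if module_available("openpyxl"):
        print("openpyxl is installed")
    else:
        print("openpyxl is missing; install it with: pip install openpyxl")
```

Running the script before mkmetadata gives a quick confirmation that the Python environment in use is the one where openpyxl was installed.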
- Copy this repository to your local machine:
git clone https://github.com/SichongP/EVA_Instrumentality
cd EVA_Instrumentality
- Follow the instructions in the next section to organize your files.
- Run mkmetadata:
chmod +x mkmetadata
./mkmetadata --out output.xlsx path_to_directory_containing_all_projects
mkmetadata scans all folders under the path_to_directory_containing_all_projects you provide and treats each folder as a project, so make sure that all project folders sit under a single folder that contains no other folders.
- Additionally, you can provide a configuration file containing your information so that mkmetadata can autofill the "Submitter Details" sheet. A template config file can be found at src/user_info.config. See here for more information. To include a user information file, use:
--user user_info.config
- You can also add a file named ".ignore" to the directory you provide. If the .ignore file exists, any file in your project or analysis folders whose name ends with a keyword on the .ignore list will be ignored.
- Complete the rest of the metadata file. (See FAQ for help)
See here for a sample workflow
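To illustrate how the .ignore rule in the steps above behaves, here is a minimal sketch of suffix-based filtering. It is not mkmetadata's actual code: the function names `read_ignore_keywords` and `is_ignored` are hypothetical, and the sketch assumes the .ignore file lists one keyword per line, which this README does not specify.

```python
from pathlib import Path
from typing import List


def read_ignore_keywords(root: Path) -> List[str]:
    """Read keywords from root/.ignore (assumed format: one keyword per line)."""
    ignore_file = root / ".ignore"
    if not ignore_file.is_file():
        return []
    lines = ignore_file.read_text().splitlines()
    # Skip blank lines and strip surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]


def is_ignored(filename: str, keywords: List[str]) -> bool:
    """Return True if the file name ends with any keyword on the list."""
    return any(filename.endswith(keyword) for keyword in keywords)
```

Under these assumptions, a .ignore file containing `.log` and `.tmp` would cause `run.log` to be skipped while `calls.vcf` is kept.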
The program relies on your file structure to determine the relationships between your projects, analyses, and files. It is therefore crucial that you organize your files correctly so that the program generates a correct metadata file.
Alternatively, if you do not wish to reorganize your files, you can refer to the FAQ to make your own metadata file.
You should have a folder for each project.
All analyses associated with a project should be put in their own folders under the project folder.
Variant files associated with each analysis should be put under the analysis folder.
Each info.config file should be put directly under the project or analysis folder it belongs to.
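As a rough illustration of how such a layout can be read back into project, analysis, and file relationships, here is a minimal sketch. It is not mkmetadata's implementation: the function name `discover_projects` is hypothetical, and skipping analysis_info.config when listing files is an assumption made for the example.

```python
from pathlib import Path
from typing import Dict, List


def discover_projects(root: Path) -> Dict[str, Dict[str, List[str]]]:
    """Map each project folder to its analysis folders and their files."""
    layout: Dict[str, Dict[str, List[str]]] = {}
    # Every folder directly under root is treated as a project.
    for project in sorted(p for p in root.iterdir() if p.is_dir()):
        analyses: Dict[str, List[str]] = {}
        # Every folder directly under a project is treated as an analysis.
        for analysis in sorted(a for a in project.iterdir() if a.is_dir()):
            # Files directly under the analysis folder belong to that analysis;
            # the optional config file is skipped (assumption for this sketch).
            analyses[analysis.name] = sorted(
                f.name
                for f in analysis.iterdir()
                if f.is_file() and f.name != "analysis_info.config"
            )
        layout[project.name] = analyses
    return layout
```

Calling `discover_projects(Path("Projects"))` on a correctly organized directory returns a nested dictionary keyed by project name and then by analysis name, which makes it easy to spot a misplaced file or folder before running mkmetadata.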
See the structure below (a trailing slash / indicates a directory instead of a file):
- Projects/
  - .ignore (optional)
  - project1/
    - project_info.config (optional)
    - analysis1/
      - analysis_info.config (optional)
      - file1_1.vcf
      - file1_2.vcf
    - analysis2/
      - analysis_info.config (optional)
      - file2_1.vcf
      - file2_2.vcf
  - project2/
    - project_info.config (optional)
    - analysis3/
      - analysis_info.config (optional)
      - file3_1.vcf
      - file3_2.vcf
    - analysis4/
      - analysis_info.config (optional)
      - file4_1.vcf
      - file4_2.vcf
- project1/