Large Datasets in Python
- Authors: Wei Zhan
- Research field: Immunology
- Lesson topic: Processing gDNA chip results and single-cell PCR results; finding shared motifs.
- Lesson content URL: https://github.com/uoftcoders/studyGroup/tree/gh-pages/lessons/python/large-data
This lesson has two parts. The first covers loading and storing personal genomic DNA (gDNA) sequencing results in a pandas.DataFrame. The second covers finding shared motifs in single-cell PCR sequencing results.
To follow along, visit the IPython notebook.
The generic_gdna.txt file contains a sample personal gDNA chip sequencing output.
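As a rough illustration of the first part, the sketch below loads chip output into a pandas.DataFrame. The layout of generic_gdna.txt is not described here, so the column names (`rsid`, `chromosome`, `position`, `genotype`), the tab separator, the `#` comment lines, and the sample rows are all assumptions made up for this example; the helper `load_gdna` is likewise hypothetical. Adjust them to match the real file.

```python
import io

import pandas as pd

# Made-up stand-in for generic_gdna.txt; the real file's layout may differ.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t12345\tAA\n"
    "rs0000002\t1\t23456\tCT\n"
    "rs0000003\tX\t34567\tGG\n"
)

# Assumed column names; they are not taken from the lesson files.
COLUMNS = ["rsid", "chromosome", "position", "genotype"]


def load_gdna(source):
    """Read tab-separated chip output into a DataFrame indexed by marker ID."""
    frame = pd.read_csv(
        source,
        sep="\t",
        comment="#",   # skip lines starting with '#'
        header=None,   # the assumed file has no plain header row
        names=COLUMNS,
        # Keep chromosome labels such as "X" and "1" as strings.
        dtype={"chromosome": str, "genotype": str},
    )
    return frame.set_index("rsid")


gdna = load_gdna(io.StringIO(SAMPLE))

# Look up one marker, then select every marker on one chromosome.
print(gdna.loc["rs0000002", "genotype"])
print(gdna[gdna["chromosome"] == "1"])
```

To read the actual file instead of the in-memory sample, pass its path, for example `load_gdna("generic_gdna.txt")`, once the column assumptions have been checked against it.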
The generic_tcr.txt file is a tab-separated plain text file containing 10 sample single-cell PCR sequencing results of the T cell receptor.
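One simple way to look for a shared motif is to find the longest substring present in every sequence. The sketch below does this by brute force with the standard library only. It is not necessarily the method used in the notebook, and the two-column layout (cell ID, sequence), the sample sequences, and the helper names `read_sequences` and `shared_motifs` are assumptions made up for illustration.

```python
import csv
import io

# Made-up stand-in for generic_tcr.txt; assumed columns: cell ID, sequence.
SAMPLE_TSV = (
    "cell_1\tTGTGCCAGCAGTTTAGC\n"
    "cell_2\tGGTGCCAGCAGCCAAGA\n"
    "cell_3\tTGCCAGCAGTGACTAGC\n"
)


def read_sequences(handle):
    """Map cell ID to upper-cased sequence from a tab-separated file handle."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[0]: row[1].strip().upper() for row in reader if len(row) >= 2}


def shared_motifs(sequences):
    """Return the longest substring(s) found in every sequence, sorted."""
    seqs = list(sequences)
    if not seqs:
        return []
    # Any shared substring must fit inside the shortest sequence.
    shortest = min(seqs, key=len)
    # Try the longest candidates first and stop at the first length that works.
    for length in range(len(shortest), 0, -1):
        candidates = {
            shortest[start:start + length]
            for start in range(len(shortest) - length + 1)
        }
        shared = sorted(c for c in candidates if all(c in s for s in seqs))
        if shared:
            return shared
    return []


tcr = read_sequences(io.StringIO(SAMPLE_TSV))
motifs = shared_motifs(tcr.values())
print(motifs)
```

This approach is fine for a handful of short reads such as the 10 sample sequences, but its cost grows quickly with sequence length, so larger inputs would call for a suffix-based method.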