Aims
For the course in “Protein structure prediction of membrane and globular proteins” you will do a project and a presentation on the same subject. In the project you will use your experience from earlier courses to develop a predictor for a “feature” of a globular or membrane protein.
- Develop a functional predictor for the “feature” assigned to you.
- Learn about the latest progress in machine learning bioinformatics.
- Write a report in which you compare your predictor with previous work and review the field for your predictor.
- Make a presentation of a scientific paper.
Mandatory activities (also for grading)
In addition to your project task you should participate in the following seminars. If you fail to do so, you will be given an extra report to write. For details see the schedule.
- Journal club in deep learning – active participation (Fridays 10:00) – Gamma Lunch Room
- Elofsson Lab group meeting (Fridays 13:30) Gamma Lunch Room
- Weekly tasks (see below). These should be submitted by 07:00 each Friday at the latest; feedback will be provided by Monday at 15:00 at the latest.
- Diary (on github) reporting your activities on the project.
- Participation in the oral presentation of your paper.
- Written report including functional predictor (all at github)
Meetings with assistants (voluntary)
If you have problems, mail them to the mailing list ahead of time. The TAs will go through these problems twice a week:
- Monday 09:00-10:00 K244
- Wednesday 09:00-10:00 K244
- Find a list of the 5 most relevant papers for your project.
- Select one of these papers and prepare an oral presentation (12 min) after discussion with the teachers.
- Make this oral presentation in a small group of about 5 students (self-organized)
- Provide written feedback to all other members of this group and to the teachers.
- Write a self evaluation (including the written feedback) of your presentation.
- Make a final oral presentation of the paper on March 7
- Extract the feature from your dataset
- Create cross-validated sets
- Train an SVM using single-sequence information, using sklearn
- Check different window sizes for the inputs
- Add evolutionary information by running PSI-BLAST and extracting the information
- Train an SVM using multiple-sequence information
- Optimize the performance of the SVM
- Analyze the results and compare them to previous work
- Use random forests and a simple decision tree and compare the performance with the SVM performance.
- Extract the data from 50 other proteins and test the performance
- Review the state of the art for your predictor
- Write a report
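The cross-validation and model-comparison steps above can be sketched as follows. This is a minimal illustration, not a reference solution: the data are synthetic stand-ins for windowed residue features, and the names `protein_ids`, `models` and `results` are hypothetical. It assumes that residues from the same protein should stay in the same fold, which `GroupKFold` enforces, so that a protein never appears in both the training and the test split.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for windowed residue features (an assumption, not course data):
# 30 "proteins" with 10 residues each, 20 features per residue, 2 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
protein_ids = np.repeat(np.arange(30), 10)

# The three model families named in the project steps, with illustrative settings.
models = {
    "svm": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# GroupKFold keeps all residues of one protein in the same fold.
cv = GroupKFold(n_splits=5)
results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, groups=protein_ids, cv=cv)
    results[name] = scores.mean()
    print(f"{name}: mean accuracy {results[name]:.3f}")
```

With real data, `X` and `y` would come from your own feature extraction, and accuracy may not be the most informative score if the classes are unbalanced.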
- Final code: 30%
- Code and git commits in weekly assignments: 10%
- Final report: 30%
- Oral presentation: 10%
- Diary: 10%
- Journal club and group meetings: 10%
- A 92-100%
- B 80-91%
- C 68-79%
- D 56-67%
- E 50-55%
- Fx 45-49%
- F 0-44%
Name | Project | Data | Paper |
Axelsson, Linnea | Membrane proteins (both types) (4 state) | alpha_beta_globular_sp_4state.txt | J Proteome Res. 2008 Feb;7(2):487-96. Enhanced membrane protein topology prediction using a hierarchical classification method and a new scoring function. Lo A, Chiu HS, Sung TY, Lyu PC, Hsu WL. |
Bindu Suresh, Aishwarya | Beta barrel membrane protein (3 state) | beta_globular_3state_m_io_nontm.txt | Savojardo C, Fariselli P, Casadio R. BETAWARE: A machine-learning tool to detect and predict transmembrane beta-barrel proteins in prokaryotes. Bioinformatics. 2013;29(4):504–5. |
Boey, Zhong Hao Daryl | Exposed – Buried 2 state | buried-exposed.3line.txt | Learning protein multi-view features in complex space. Yu, DJ., Hu, J., Wu, XW. et al. Amino Acids (2013) 44: 1365. https://doi.org/10.1007/s00726-013-1472-6. |
Hallberg, Olaf | | | |
Hesselman, Maria Carmen | Buried exposed – Alpha helical membrane proteins | buried_exposed_alpha.3line.txt | A sequence-based computational model for the prediction of the solvent accessible surface area for α-helix and β-barrel transmembrane residues http://onlinelibrary.wiley.com/doi/10.1002/jcc.21936/full |
Juszczak, Kajetan | Buried exposed – Beta barrel membrane proteins | buried_exposed_beta.3line.txt | TMBHMM: A frequency profile based HMM for predicting the topology of transmembrane beta barrel proteins and the exposure status of transmembrane residues https://www.sciencedirect.com/science/article/pii/S1570963911000550 |
Kaldhusdal, Vilde | 3 state secondary structure | dssp_3state.3line.txt | Prediction of protein secondary structure content by using the concept of Chou’s pseudo amino acid composition and support vector machine. https://www.ncbi.nlm.nih.gov/pubmed/19149669 |
Kiik, Helen | 8 state secondary structure | dssp_8_state.3line.txt | SPSSM8: An accurate approach for predicting eight-state secondary structures of proteins. https://www.sciencedirect.com/science/article/pii/S0300908413003210 |
Kimler, Kyle | Signal Peptide (2 state) | globular_signal_peptide_2state.txt | “DeepSig: deep learning improves signal peptide detection in proteins” Savojardo et al 2017 Bioinformatics |
Krali, Olga (u2376) | Signal Peptide + TM (3 state) | globular_signal_tm_3state.txt | Tsirigos, K. D. et al. (2015). The TOPCONS web server for consensus prediction of membrane protein topology and signal peptides. Nucleic Acids Research, 43: W401-W407. doi: 10.1093/nar/gkv485. |
Kumar, Sharmishtaa | Alpha helical TM proteins (3 state) | membrane-alpha.3line.txt | The MEMPACK alpha-helical transmembrane protein structure prediction server: https://academic.oup.com/bioinformatics/article/27/10/1438/259347 |
Lautenbach, Maximilian Julius | Alpha helical TM proteins (2 state) | membrane-alpha_2state.3line.txt | “TMSEG: novel prediction of transmembrane helices” – Michael Bernhofer, Edda Kloppmann, Jonas Reeb, and Burkhard Rost; November 2016 in Proteins (doi: 10.1002/prot.25155) |
Martinez Hernandez, Marina | Beta barrel (2 state) | membrane-beta_2state.3line.txt | BOCTOPUS: improved topology prediction of transmembrane β barrel proteins. Sikander Hayat and Arne Elofsson. Structural bioinformatics. Vol. 28 no. 4 2012, pages 516–522 doi:10.1093/bioinformatics/btr710. Link: https://academic.oup.com/bioinformatics/article/28/4/516/213408 |
Pérez Gómez, Fernando | Beta barrel (3 state) | membrane-beta_3state.3line.txt | Tian, Wei, et al. “High-Resolution Structure Prediction β-Barrel Membrane Proteins.” Proceedings of the National Academy of Sciences, vol. 115, no. 7, 2018, pp. 1511–1516., doi:10.1073/pnas.1716817115. http://www.pnas.org/content/pnas/115/7/1511.full.pdf |
Rodriguez, Lucia | Beta barrel (4 state) | membrane-beta_4state.3line.txt | Topology of membrane proteins-predictions, limitations and variations https://www.ncbi.nlm.nih.gov/pubmed/29100082 |
Ropat, Maryia | 3 state secondary structure | stride_3state.3line.txt | A Novel Method for Protein Secondary Structure Prediction Using Dual-Layer SVM and Profiles https://www.ncbi.nlm.nih.gov/pubmed/14997569 doi:10.1002/prot.10634 |
Rosa, André | Transmembrane alpha and beta proteins | tm_alpha_beta_3state.txt | PredβTM: A Novel β-Transmembrane Region Prediction Algorithm by Roy Choudhury A et al. (2015). Transmembrane region predictor for beta-barrel proteins. |
Senftleben, Maximilian Lukas | Transmembrane alpha, beta and globular | tm_globular_3state.txt | Transmembrane protein topology prediction using support vector machines. Nugent T, Jones DT. |
Valiukonyte, Milda | Buried and exposed residues in membrane proteins | buried_exposed_alpha+beta.3line.txt | Helms, V., Hayat, S., & Metzger, J. (2010). Predicting the burial/exposure status of transmembrane residues in helical membrane proteins. In Structural Bioinformatics of Membrane Proteins (pp. 151-164). Springer, Vienna. |
Xi, Yuanyuan | 3 state secondary structure | cas1.fasta | Magnan, C.N. and Baldi, P., 2014. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics, 30(18), pp.2592-2597. https://www.sciencedirect.com/science/article/pii/S0022283601945802 |
Xu, Fuqi | 3 state secondary structure | cas2.fasta | Kim, Hyunsoo, and Haesun Park. “Protein secondary structure prediction based on an improved support vector machines approach.” Protein Engineering 16.8 (2003): 553-560. https://scholar.google.se/scholar?hl=en&as_sdt=0%2C5&as_vis=1&q=Protein+secondary+structure+prediction+based+on+an+improved+support+vector+machines+approach.&btnG= |
Xu, Shuhan | 3 state secondary structure | cas3.fasta | Wang, Sheng, et al. “Protein secondary structure prediction using deep convolutional neural fields.” Scientific reports 6 (2016): 18962. |
Yang, Ke | | | |
Zhang, Youcheng | Signal Peptide Gram-negative | gram-signal.txt | Petersen, T. N., Brunak, S., von Heijne, G. & Nielsen, H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature methods 8, 785-786, doi:10.1038/nmeth.1701 (2011). |
Özgün, Ceren | Secondary structure | jpred1.fasta | Protein secondary structure prediction using a small training set (compact model) combined with a Complex-valued neural network approach S. Rashid, S. Saraswathi, A. Kloczkowski, S. Sundaram and A. Kolinski BMC Bioinformatics 2016 17:362 https://doi.org/10.1186/ |
Furugård, Cecilia | Signal Peptides in Eukaryotes | euk-signal.txt | Nielsen H, Engelbrecht J, Brunak S, von Heijne G. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Engineering, Design and Selection 10: 1–6. |
Paz Barba, Miriam | Signal peptide Gram-positive | gram+-signal.txt | Protein secretion and surface display in Gram-positive bacteria. Olaf Schneewind and Dominique M. Missiakas. Phil. Trans. R. Soc. B (2012) 367, 1123–1139 DOI: 10.1098/rstb.2011.0210 |
Ramakant Mishra (u2375) | Secondary structure prediction | jpred2.fasta | Heffernan et al. 2017: Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility: https://academic.oup.com/bioinformatics/article/33/18/2842/3738544 |
- Title
- Author
- Abstract
- Introduction
- Results
- Discussion
- Conclusions
- References
- Week 1 (Day 1-2)
- Bash etc http://swcarpentry.github.io/shell-novice/
- Git http://swcarpentry.github.io/git-novice
- Python http://swcarpentry.github.io/python-novice-inflammation/
- Linux Tips and Tricks
- How to organize your project
- Compulsory exercises
- Write a bash script that, when run, creates a template project folder structure (see the tips above on how to organize your project).
- Create a new repo on github (sign up if you do not already have an account) and push your file to this repo. Send a link to the repo to john.lamb@scilifelab.se.
- Friday: Work on your own
- Friday: Elofsson group meeting @scilifelab
- Week 2
- Mon: Project help
- Mon: Finish the Software Carpentry tasks and show them to the assistants (see above under Week 1).
- Tue: Project start
- Wed: Project help
- Thu: Work on your own
- Fri: Journal club @scilifelab
- Friday: Elofsson group meeting @scilifelab
- Fri: Submit list of 5 relevant papers to present to arne@bioinfo.se
- Fri: One page project plan submitted as a PDF to elofsson.arne.su@analys.urkund.se
- Week 3
- Mon: Receive the list of papers to present
- Goals of the week:
- Demonstration of a program that can create input to sklearn
- Practice paper presentation in front of your colleagues
- Mon: Project help
- Wed: Project help
- Fri: submit “week 3 report”:
- Check in a program on github that takes a sequence and its feature, creates an input to sklearn, and predicts your feature using single-sequence information
- Submit the evaluation of the 4 presentations (by your colleagues) that you have done.
- Diary
- Fri: Journal club @scilifelab
- Friday: Elofsson group meeting @scilifelab
- Week 4
- Goals:
- Ensure that your predictor predicts the output.
- Paper presentation
- Mon: Project help
- Wed: Project help
- Thu: Paper presentations
- Fri: Journal club @scilifelab
- Friday: Elofsson group meeting @scilifelab
- Fri: Week 4 report
- One page self-evaluation of your presentation including the written comments by your peers submitted to elofsson.arne.su@analys.urkund.se
- Provide an optimized version of your program using sklearn.
- Week 5
- Goals:
- Final predictor
- Final report
- Mon: Project help
- Tue: Almost final version of the predictor working and available at your github account.
- Wed: Project help – last chance to get feedback from TAs
- Fri: Journal club @scilifelab
- Friday: Elofsson group meeting @scilifelab
- Week 6
- Monday: Report submitted as a 5-10 page PDF sent to elofsson.arne.su@analys.urkund.se
- Monday: Final predictor uploaded to github
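The Week 3 goal of turning a sequence and its feature into sklearn input can be sketched as follows. This is only an illustration under stated assumptions: it assumes the `.3line.txt` files hold records of three lines each (a `>` header, the sequence, and one label character per residue), and the function names, the toy record and the `C`/`H` labels are hypothetical. Each residue is encoded together with its neighbours as a one-hot sliding window; positions outside the sequence and unknown residues are left as zeros.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def parse_three_line(text):
    """Parse records of the assumed form: '>id', sequence, per-residue labels."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected three lines per record")
    records = []
    for i in range(0, len(lines), 3):
        header, seq, labels = lines[i], lines[i + 1], lines[i + 2]
        if not header.startswith(">") or len(seq) != len(labels):
            raise ValueError(f"malformed record starting at non-empty line {i + 1}")
        records.append((header[1:], seq, labels))
    return records


def encode_windows(seq, window=3):
    """One-hot encode each residue with its neighbours; window must be odd."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    rows = []
    for centre in range(len(seq)):
        row = []
        for pos in range(centre - half, centre + half + 1):
            one_hot = [0] * len(AMINO_ACIDS)
            # Padding positions and unknown residues stay all-zero.
            if 0 <= pos < len(seq) and seq[pos] in AA_INDEX:
                one_hot[AA_INDEX[seq[pos]]] = 1
            row.extend(one_hot)
        rows.append(row)
    return rows


def build_dataset(records, window=3):
    """Stack the windows of all records into feature rows X and labels y."""
    X, y = [], []
    for _, seq, labels in records:
        X.extend(encode_windows(seq, window))
        y.extend(labels)
    return X, y


# Hypothetical toy record, not taken from the course data.
example = ">toy_protein\nMKTAY\nCHHHC\n"
X, y = build_dataset(parse_three_line(example), window=3)
print(len(X), len(X[0]), y)
```

The resulting `X` and `y` are plain lists that sklearn estimators accept directly; changing `window` is one way to run the window-size comparison.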
- Jones DT. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292: 195-202.
- Noble WS (2009) A Quick Guide to Organizing Computational Biology Projects. PLoS Comput Biol 5(7): e1000424.
- Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin. A Practical Guide to Support Vector Classification. http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
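The guide by Hsu, Chang and Lin recommends choosing C and gamma for an RBF-kernel SVM by a cross-validated grid search over exponentially spaced values. A minimal sketch of that idea with sklearn is shown below. The data are synthetic and the grid is deliberately smaller than a realistic one, so both are assumptions made for illustration only.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic toy data (an assumption, not course data) with a non-linear target.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# Exponentially spaced candidate values; a reduced grid for illustration.
param_grid = {
    "C": [2.0 ** k for k in range(-3, 8, 2)],
    "gamma": [2.0 ** k for k in range(-9, 2, 2)],
}

# Every (C, gamma) pair is scored by 3-fold cross-validation.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

For per-residue data, a splitter that keeps each protein within one fold may be a better choice for `cv` than the default.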
General information
The report should be sent by email to elofsson.arne.su@analys.urkund.se
Reporting
All reporting is done online. If your answers are satisfactory, you will receive a mail saying that you passed the practical in question; otherwise you will receive a mail stating which answers require more work. You then need to resubmit your report, clearly indicating the corrections you made to those answers.
Originality
You may work together in small groups, but every student must write his/her own report. You may not copy each other’s reports. Copying a report, or copying it and then changing a few words, will be considered cheating. The purpose of the course is to ensure that, after having gone through it, every student will possess the necessary skills to work independently with these tools in later courses.
Please note that all lab reports are checked for plagiarism, so you cannot copy text from anywhere. Failing these checks might result in disciplinary action.
To help you avoid plagiarism, go through the tutorial at the following link:
http://www.ub.gu.se/ref/Refero_eng/1intro.php
It contains information on plagiarism and how to avoid it. If you have questions please contact your teacher.
Schedule
List of papers to present
ARANDA GUILLÉN MARIA ISABEL
ARROYO GOMEZ JAVIER
Prediction of the burial status of transmembrane residues of helical membrane proteins https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-8-302
CHALLOORI MOUNIKA
Simultaneous prediction of protein secondary structure and transmembrane spans
Julia Koehler Leman, Ralf Mueller, Mert Karakas, Nils Woetzel, and Jens Meiler 10.1002/prot.24258
ESTER MANUEL
Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4707437/
GUPTA REVANT
Taigang Liu, Xiaoqi Zheng, Jun Wang, Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile, Biochimie, Volume 92, Issue 10, October 2010, Pages 1330-1334, ISSN 0300-9084.
KYRIAKIDIS VASILIEIOS-EVRIPIDIS
Evaluation of transmembrane helix predictions in 2014.
Proteins. 2015 Mar;83(3):473-84. doi: 10.1002/prot.24749. Epub 2015 Jan 22.
Reeb J, Kloppmann E, Bernhofer M, Rost B.
HE TIANLIN
Prediction of protein relative solvent accessibility with support vector machines and long-range interaction 3D local descriptor
LANILLOS JAVIER
Prediction of protein solvent accessibility using support vector machines.
Z Yuan, K Burrage, JS Mattick – Proteins: Structure, Function, …, 2002 – Wiley Online Library: http://onlinelibrary.wiley.com/doi/10.1002/prot.10176/full
LAWSON RYNO
LEE KA WAI
Hayat, S. and Elofsson, A. (2012) BOCTOPUS: improved topology prediction of transmembrane beta barrel proteins. Bioinformatics 28 (4) : 516-522.
LÖVERLI ELINOR
Nugent T, Jones DT. Transmembrane protein topology prediction using support vector machines.
BMC Bioinformatics. 2009.
NORMARK TANJA
ROSA ANDRÉ
SAVATIER-DUPRÉ BAÑARES CAROLINA
Thomas Nordahl Petersen, Søren Brunak, Gunnar von Heijne & Henrik Nielsen (2011).
SignalP 4.0: discriminating signal peptides from transmembrane regions.
Link: http://www.nature.com/nmeth/journal/v8/n10/pdf/nmeth.1701.pdf
TOMAR SIDDHARTH
TMBHMM: a frequency profile based HMM for predicting the topology of transmembrane beta barrel proteins
and the exposure status of transmembrane residues.
doi: 10.1016/j.bbapap.2011.03.004
WIRJOWERDOJO DIMITRI ALVIN
Rashid, Shamima, Saras Saraswathi, Andrzej Kloczkowski, Suresh Sundaram, and Andrzej Kolinski, ‘Protein Secondary Structure Prediction Using a Small Training Set (Compact Model) Combined with a Complex-Valued Neural Network Approach’, BMC Bioinformatics, 2016, 1–18 <http://dx.doi.org/10.1186/s12859-016-1209-0>
VON BERLIN LEONIE
Kazemian, Hassan B., Kenneth White, and Dominic Palmer-Brown. “Applications of evolutionary SVM to prediction of membrane alpha-helices.” Expert systems with applications 40.9 (2013): 3412-3420.
XUEQING WANG
Magnan, C. N. & Baldi, P. SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics 30, 2592-2597 (2014)
Updated articles
NORMARK TANJA
Predicting the Solvent Accessibility of Transmembrane Residues from Protein Sequence. Z. Yuan et al.
ARANDA GUILLÉN MARIA ISABEL
PredβTM: A Novel β-Transmembrane Region Prediction Algorithm. Roy Choudhury A , Novič M PLoS One. 2015
ROSA ANDRÉ
Lukas Käll, Anders Krogh and Erik L. L. Sonnhammer.
Advantages of combined transmembrane topology and signal peptide prediction–the Phobius web server
Nucleic Acids Res., 35:W429-32, July 2007
LAWSON RYNO
Improving transmembrane protein consensus topology prediction using inter-helical interaction