{"componentChunkName":"component---src-templates-blog-post-tsx","path":"/gsoc-ideas-2020","result":{"data":{"markdownRemark":{"id":"44c47f5c-a2ec-58bf-80c2-e1da489d85a3","excerpt":"<p>Announcement, announcement! After a successful experience with\n<a href=\"https://developers.google.com/season-of-docs\">Google Season of Docs</a> in 2019,\nwe’re putting out a call for students to apply…</p>","html":"<p>Announcement, announcement! After a successful experience with\n<a href=\"https://developers.google.com/season-of-docs\">Google Season of Docs</a> in 2019,\nwe’re putting out a call for students to apply to work with DVC as part of\n<a href=\"https://summerofcode.withgoogle.com/\">Google Summer of Code</a>. If you want to\nmake a dent in open source software development with mentorship from our team,\nread on.</p>\n<h2>Prerequisites to apply</h2>\n<p>Besides the general requirements to apply to Google Summer of Code, there are a\nfew skills we look for in applicants.</p>\n<ol>\n<li><strong>Python experience.</strong> All of our core development is done in Python, so we\nprefer candidates that are experienced in Python. However, we will consider\napplicants who are very strong in another language and familiar with Python\nbasics.</li>\n<li><strong>Git experience.</strong> Git is also a key part of DVC development, as DVC is\nbuilt around Git; that said, for certain projects (rated as “Beginner”) a\nsurface-level knowledge of Git will be sufficient.</li>\n<li><strong>People skills.</strong> Beyond technical fundamentals, we put a high value on\ncommunication skills: the ability to report and document your experiments and\nfindings, to work kindly with teammates, and explain your goals and work\nclearly.</li>\n</ol>\n<p>If you like our mission but aren’t sure if you’re sufficiently prepared, please\nbe in touch anyway. We’d love to hear from you.</p>\n<h2>Project ideas</h2>\n<p>Below are several project ideas that are an immediate priority for the core DVC\nteam. 
Of course, we welcome students to create their own proposals, even if they\ndiffer from our ideas. Projects will be primarily mentored by co-founders\n<a href=\"https://github.com/dmpetrov\">Dmitry Petrov</a> and\n<a href=\"https://github.com/shcheklein\">Ivan Shcheklein</a>.</p>\n<ol>\n<li><strong>Migrate to the latest v3 API to improve Google Drive support.</strong> Our\norganization is a co-maintainer of the PyDrive library in collaboration with\na team at Google. The PyDrive library is now several years old and still\nrelies on the v2 protocol. We would like to migrate to v3, which we expect\nwill boost performance for many DVC use cases (e.g., the ability to filter\nfields being retrieved from our API). For this project, we’re looking\nfor a student to work with us to prepare the next major version of the\nPyDrive library, as well as making important changes to the core DVC code to\nsupport it. Because PyDrive is broadly used outside of DVC, this project is a\nchance to work on a library of widespread interest to the Python community.\n<br> <br> <em>Skills required:</em> Python, Git, experience with APIs <br>\n<em>Difficulty rating:</em> Beginner-Medium <br></li>\n<li><strong>Introducing parallelism to DVC.</strong> One of DVC’s features is the ability to\ncreate pipelines, linking data repositories with code to process data, train\nmodels, and evaluate model metrics. Once a DVC pipeline is created, the\npipeline can be shared and re-run in a systematic and entirely reproducible\nway. Currently, DVC executes pipelines sequentially, even though some steps\nmay be run in parallel (such as data preprocessing). We would like to support\nparallelization for pipeline steps specified by the user. 
Furthermore, we’ll\nneed to support building flags into DVC commands that specify the level of\nparallelization (CPU, GPU or memory). <br> <br> <em>Skills required:</em>\nPython, Git. Some experience with parallelization and/or scientific computing\nwould be helpful but not required. <br> <em>Difficulty rating:</em> Advanced\n<br></li>\n<li><strong>Developing use cases for data registries and ML model zoos.</strong> A new DVC\nfunctionality that we’re particularly excited about is <code class=\"language-text\">summon</code>, a method\nthat can turn remotely-hosted machine learning artifacts such as datasets,\ntrained models, and more into objects in the user’s local environment (such\nas a Jupyter notebook). This is a foundation for creating data catalogs of\ndata-frames and machine learning model zoos on top of Git repositories and\ncloud storages (like GCS or S3). We need to identify and implement model zoos\n(think PyTorch Hub, the Caffe Model Zoo, or the TensorFlow DeepLab Model Zoo)\nand data registries for types that are not supported by DVC yet. Currently,\nwe’ve tested <code class=\"language-text\">summon</code> with PyTorch image segmentation models and Pandas\ndataframes. 
We’re looking for students to explore other possible use cases.\n<br> <br> <em>Skills required:</em> Python, Git, and some machine learning or\ndata science experience <br> <em>Difficulty rating:</em> Beginner-Medium <br></li>\n<li><strong>Continuous delivery for JetBrains TeamCity.</strong> Continuous integration and\ncontinuous delivery (CI/CD) for ML projects is an area where we see\n<a href=\"https://martinfowler.com/articles/cd4ml.html\">DVC make a big impact</a>,\nspecifically by delivering datasets and ML models into CI/CD pipelines.\nWhile there are many cases when DVC is used inside GitHub Actions and GitLab\nCI, you will be transferring this experience to another type of CI/CD system,\n<a href=\"https://www.jetbrains.com/teamcity/\">JetBrains TeamCity</a>. We’re working to\nintegrate DVC’s model and dataset versioning into TeamCity’s CI/CD toolkit.\nThis project would be ideal for a student looking to explore the growing\nfield of MLOps, an offshoot of DevOps with the specifics of ML projects at\nthe center. <br> <br> <em>Skills required:</em> Python, Git, bash scripting. It\nwould be nice, but not necessary, to have some experience with CI/CD tools\nand developer workflow automation. <br> <em>Difficulty rating:</em>\nMedium-Advanced <br></li>\n<li><strong>DVC performance testing framework.</strong> Performance is a core value of DVC. We\nwill be creating a performance monitoring and testing framework where new\nscenarios (e.g., unit testing) can be populated. 
The framework should reflect\nall performance improvements and degradations for each of the DVC releases.\nIt would be especially compelling if testing could be integrated with our\nGitHub workflow (CI/CD). This is a great opportunity for a student to learn\nabout DVC and versioning in-depth and contribute to its stability. <br>\n<br> <em>Skills required:</em> Python, Git, bash scripting. <br> <em>Difficulty\nrating:</em> Medium-Advanced <br></li>\n</ol>\n<h2>If you’d like to apply</h2>\n<p>Please refer to the\n<a href=\"https://summerofcode.withgoogle.com/\">Google Summer of Code</a> application guides\nfor specifics of the program. Students looking to know more about DVC, and our\nworldwide community of contributors, will learn most by visiting our\n<a href=\"https://dvc.org/chat\">Discord channel</a>,\n<a href=\"https://github.com/iterative/dvc\">GitHub repository</a>, and\n<a href=\"https://discuss.dvc.org/\">Forum</a>. 
We are available to discuss project proposals\nfrom interested students and can be reached by <a href=\"support@dvc.org\">email</a> or on\nour Discord channel.</p>","timeToRead":4,"fields":{"slug":"/gsoc-ideas-2020"},"frontmatter":{"title":"Join DVC for Google Summer of Code 2020","date":"February 04, 2020","description":"A call for student applications for Google Summer of Code 2020.\n","descriptionLong":"DVC is looking for students to take part in Google Summer of Code 2020.\n","tags":["Google Summer of Code","DVC","Students","Mentoring"],"commentsUrl":"https://discuss.dvc.org/t/join-dvc-for-google-summer-of-code/317","author":{"childMarkdownRemark":{"frontmatter":{"name":"Elle O'Brien","avatar":{"childImageSharp":{"fixed":{"base64":"data:image/jpeg;base64,/9j/2wBDABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVGC8aGi9jQjhCY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2P/wgARCAAUABQDASIAAhEBAxEB/8QAGQABAAIDAAAAAAAAAAAAAAAAAAMFAgQG/8QAFQEBAQAAAAAAAAAAAAAAAAAAAgP/2gAMAwEAAhADEAAAAZmtNOlyjIcrZgpiEP/EABsQAQACAgMAAAAAAAAAAAAAAAIBAxIhABEz/9oACAEBAAEFArV0dVzy942N41GdSpSt8A0D/8QAFxEAAwEAAAAAAAAAAAAAAAAAAQIgIf/aAAgBAwEBPwFRkf/EABURAQEAAAAAAAAAAAAAAAAAAAEg/9oACAECAQE/AWP/xAAfEAACAQIHAAAAAAAAAAAAAAAAARACEQMSUVJhcYH/2gAIAQEABj8CS3MzUexhvQ7h3FwWTP/EAB0QAAMBAAEFAAAAAAAAAAAAAAABESFRMUFxsdH/2gAIAQEAAT8hsbi0nApKKKoujQjiDTN+15X0ovgcWhPSElmTuf/aAAwDAQACAAMAAAAQHOi9/8QAGBEAAwEBAAAAAAAAAAAAAAAAAAExEBH/2gAIAQMBAT8QRRwUz//EABcRAAMBAAAAAAAAAAAAAAAAAAEQITH/2gAIAQIBAT8QOo6v/8QAHRABAAMBAAIDAAAAAAAAAAAAAQARITFxkUFhsf/aAAgBAQABPxB89x9U6Sn6EA69v7iEs5LB0aDWr38jGAHlgPfSAs0pXqIzhQ8QFPkh4ORpypLXVhnif//Z","width":40,"height":40,"src":"/static/1614906361c7d460137741db062e0c7e/d83e5/elle_obrien.jpg","srcSet":"/static/1614906361c7d460137741db062e0c7e/d83e5/elle_obrien.jpg 1x,\n/static/1614906361c7d460137741db062e0c7e/58860/elle_obrien.jpg 1.5x,\n/static/1614906361c7d460137741db062e0c7e/90ac5/elle_obrien.jpg 
2x","srcWebp":"/static/1614906361c7d460137741db062e0c7e/e145b/elle_obrien.webp","srcSetWebp":"/static/1614906361c7d460137741db062e0c7e/e145b/elle_obrien.webp 1x,\n/static/1614906361c7d460137741db062e0c7e/0d42c/elle_obrien.webp 1.5x,\n/static/1614906361c7d460137741db062e0c7e/f46db/elle_obrien.webp 2x"}}}}}},"picture":{"childImageSharp":{"fluid":{"base64":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAPCAIAAABr+ngCAAAACXBIWXMAAAsSAAALEgHS3X78AAADgUlEQVQozwXBWVMaBwAA4P1NnelTxk47bVLbzjQWqkUEgQpyLMshcizLcuxyLKfIsSyLsqAoKmqCRCSySyCx9UJEMYk1Xu1M27f+hn4fEEdnKL96N6NOIgIM4mezwWDQ4bWrMedsPGQnCHcgFgrMRWZAiVzIl/3CU4l4CZckZBKETUJg3gmuU/BuWkq5hC7N8wqN5FYYHyyPuMHMZmX19DK9tjYza5CKxWMjvMlR3ox8lLCKSyHZglcLpN3ajZhyK2eng3q36rvcYrTYe4/awPRqkfnj38LggVwpIrABUkzC2gkUHINVPFjFDyOSpFMFkBFnvpAiGSrFZALqp3PZaOL2P6JYCOWozMe/qf5djGH8mC2GToUtArOCZ5ji62V8nYyPGyVAcKmQLi/HyuVgoUD7tWRti7j6J9K9jr7tzvUf4oM/E63jdDYSx7U2jcAyzbeqxrTS0RmlAIVEgPfoPXFw5ntzhDffRZpt/PTafXbrOfuE9+7ws0+B87sE25nL09HMfByRuqAxs1ro0otwg9ABiQFH7xa5eHRc/mU7v7f0H63n93D/Abl4QC8fkfN7tP9AslyB9OQTsNsyZVXyTfKf9bIRXxjD/S7A3znSHQyU9QPd8TXY7qlfH8q32cnSzhi5OrXNKXY6I96kGAsbHBaBSvG9cHJ4QjI0rjDRy+mkD8icHE9ThSc65zM/OQShTyDnEBz9XO38TIF8m6wIy02+1jwbS9pxL2RHjGbI6XMQhD6ECsvzIiDFtlIk4cGs4uzG13sXz/e7IwcfxrnfJ9i3E1zHWH9RWsOolCYQUVQq2m4H7LRn2ba5xepa7xCg2vD1OEO/o+M+koX7lcVBPvHSY9zNENXQ+joIrcbhKgW9XA6HNKGoJZ1xUYSWsMuiPkM65gS6LePKsmF/y3zZgEu0uZwzryVBrFlH914fbmJxrjbUvtEv0tbk/HQganQ5YolIOOzIY8JcIQQgFgPl1WyE5dW0rkqbsu7J1ZQlt82o6xxeKbnZhvxV3cRkv2lfSZgVTTaHMEVzrUHSnoUcDiwEJC8SSiagiHrUW3m0QUMJ0h9MhZWvmr+yv33R+fAD1x3mes+aJ+MbNdH2nnKzpq82wU7X1uCAnH+6ktTXSHAnAy7NG3bp2egSJeQOpeyh7OBi6M3lV+3Bl52rp1xvuHn60/7xj+wpVtlWntyIjm/+B33U91Mw75ALAAAAAElFTkSuQmCC","aspectRatio":1.3333333333333333,"src":"/static/1fcd448df146351e7f11de97c530561a/286b3/Summer_of_Code_small.png","srcSet":"/static/1fcd448df146351e7f11de97c530561a/1f44b/Summer_of_Code_small.png 213w,\n/static/1fcd448df146351e7f11de97c530561a/3e433/Summer_of_Code_small.png 
425w,\n/static/1fcd448df146351e7f11de97c530561a/286b3/Summer_of_Code_small.png 850w,\n/static/1fcd448df146351e7f11de97c530561a/9a739/Summer_of_Code_small.png 1275w,\n/static/1fcd448df146351e7f11de97c530561a/c47cc/Summer_of_Code_small.png 1700w,\n/static/1fcd448df146351e7f11de97c530561a/c5a1b/Summer_of_Code_small.png 2000w","srcWebp":"/static/1fcd448df146351e7f11de97c530561a/5c1d9/Summer_of_Code_small.webp","srcSetWebp":"/static/1fcd448df146351e7f11de97c530561a/99b2d/Summer_of_Code_small.webp 213w,\n/static/1fcd448df146351e7f11de97c530561a/23220/Summer_of_Code_small.webp 425w,\n/static/1fcd448df146351e7f11de97c530561a/5c1d9/Summer_of_Code_small.webp 850w,\n/static/1fcd448df146351e7f11de97c530561a/5e720/Summer_of_Code_small.webp 1275w,\n/static/1fcd448df146351e7f11de97c530561a/35cfd/Summer_of_Code_small.webp 1700w,\n/static/1fcd448df146351e7f11de97c530561a/37117/Summer_of_Code_small.webp 2000w","sizes":"(max-width: 850px) 100vw, 850px","presentationWidth":850}}},"pictureComment":null}}},"pageContext":{"next":{"fields":{"slug":"/february-20-dvc-heartbeat"},"frontmatter":{"title":"February '20 DVC❤️Heartbeat"}},"previous":{"fields":{"slug":"/january-20-community-gems"},"frontmatter":{"title":"January '20 Community Gems"}},"currentPage":2,"slug":"/gsoc-ideas-2020"}}}