{"componentChunkName":"component---src-templates-blog-post-tsx","path":"/may-19-dvc-heartbeat","result":{"data":{"markdownRemark":{"id":"03b08ad6-3eee-5ccc-9ce6-d9ee51186751","excerpt":"<h2>News and links</h2>\n<p>This section of DVC Heartbeat is growing with every new Issue and this is\nalready quite a good piece of news!</p>\n<p>One of the most exciting things we want to share this month is acceptance of DVC\ninto the <a href=\"https://developers.google.com/season-of-docs/\">Google Season of Docs</a>.\nIt is a new and unique program sponsored by Google that pairs technical writers\nwith open source projects to collaborate and improve the…</p>","html":"<h2>News and links</h2>\n<p>This section of DVC Heartbeat is growing with every new Issue and this is\nalready quite a good piece of news!</p>\n<p>One of the most exciting things we want to share this month is acceptance of DVC\ninto the <a href=\"https://developers.google.com/season-of-docs/\">Google Season of Docs</a>.\nIt is a new and unique program sponsored by Google that pairs technical writers\nwith open source projects to collaborate and improve the open source project\ndocumentation. You can find the outline of DVC vision and project ideas in\n<a href=\"https://blog.dataversioncontrol.com/dvc-project-ideas-for-google-summer-of-docs-2019-defe3a73b248\">this dedicated blogpost</a>\nand check the\n<a href=\"https://developers.google.com/season-of-docs/docs/participants/\">full list of participating open source organizations</a>.\nTechnically the\n<a href=\"https://developers.google.com/season-of-docs/docs/timeline\">program is starting in a few months</a>,\nbut there is already a fantastic increase in the amount of commits and\ncontributors, and we absolutely love it!</p>\n<p>The other important milestone for us was the first offline meeting with our\ndistributed remote team. Working side by side and having non-Zoom meetings with\nthe team was amazing. 
Joining our forces to prepare for the upcoming conferences\nturned out to be the most valuable, educating and uniting experience for the\nwhole team.</p>\n<p>It’s a shame that our tech lead was unable to join us it due to another visa\ndenial. We do hope he will finally make it to the USA for the next big\nconference.</p>\n<p><html><head></head><body><span class=\"gatsby-resp-image-wrapper\" style=\"position: relative; display: block; margin-left: auto; margin-right: auto;  max-width: 700px;\">\n      <a class=\"gatsby-resp-image-link\" href=\"/static/060f8f204b833689b1569a4162d67e3d/6d894/the-world-is-changing.png\" style=\"display: block\" target=\"_blank\" rel=\"noopener\">\n    <span class=\"gatsby-resp-image-background-image\" style=\"padding-bottom: 58.801955990220044%; position: relative; bottom: 0; left: 0; background-image: url(&#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAMCAIAAADtbgqsAAAACXBIWXMAAAsSAAALEgHS3X78AAAB+UlEQVQoz3VSiW7aQBT0L7QY0xASzA0xCYe9mCOcpsQ4uIGkVdKqPggYjA0BQwqNFKnqn/cZQ0SlVhpZs7M7b988LxbgDEpYxFpT1FsH6xNAvDWjb54z4orqWHhOdf8f2ElND3NmpGmmxCVZ0/2VcbhhXHSsZHsR5+dvZnyLQ2Kb3Uh+T3/HkeSiJZyRYIlvFQdETvGwCpGTCdYm7pwM8LB7c6ypZcRZlNNizdE5P0l8HEcaGnxT19MLwaT4SRT0tglKvDkCMckbUW54XFRxZJtH2e4m1npKCha6+0m1rcTV/Ky9oLtrurdhbjfBugH5U50lxc9hLsUvL6G67i1ApwpGtbSUMM2Ks/ytVX1YJ9sGkExnhm7mbM8qfl6d8RP22yv79ZW5fylJvyrq7/PuKsxPfZyJEawKwdw5Bd+mtYMh2VaQHc+NJCKvkvxTRFgAAlez2PUSkBBXIX6O/T39PiR54wTb300YQWnFFlkVZxSckV2MDGRndjGKvzpKf1rS3R/BxuQdLZFVPS0u0+LKXx3D0Q/5frb7nBQWznli96sObvYWH/2VwW6b7R8V+v7yAIhdHaknl4+e/GFrezN05S0NAzU91NCPS0OnSbI2hnYcDiIM+bQyOioM/mE+LWvRpgmvzVfWHEOYM+C1Oru+Sw1Kk9Wxbd5f/gdV/LcK4QQLOwAAAABJRU5ErkJggg==&#x27;); background-size: cover; display: block;\"></span>\n  <picture>\n        <source srcset=\"/static/060f8f204b833689b1569a4162d67e3d/c54d4/the-world-is-changing.webp 175w, /static/060f8f204b833689b1569a4162d67e3d/a3432/the-world-is-changing.webp 350w, /static/060f8f204b833689b1569a4162d67e3d/426ac/the-world-is-changing.webp 700w, 
/static/060f8f204b833689b1569a4162d67e3d/c139f/the-world-is-changing.webp 1050w, /static/060f8f204b833689b1569a4162d67e3d/7f403/the-world-is-changing.webp 1400w, /static/060f8f204b833689b1569a4162d67e3d/2ec87/the-world-is-changing.webp 1636w\" sizes=\"(max-width: 700px) 100vw, 700px\" type=\"image/webp\">\n        <source srcset=\"/static/060f8f204b833689b1569a4162d67e3d/17006/the-world-is-changing.png 175w, /static/060f8f204b833689b1569a4162d67e3d/d6f3f/the-world-is-changing.png 350w, /static/060f8f204b833689b1569a4162d67e3d/69344/the-world-is-changing.png 700w, /static/060f8f204b833689b1569a4162d67e3d/b1f9d/the-world-is-changing.png 1050w, /static/060f8f204b833689b1569a4162d67e3d/3fc71/the-world-is-changing.png 1400w, /static/060f8f204b833689b1569a4162d67e3d/6d894/the-world-is-changing.png 1636w\" sizes=\"(max-width: 700px) 100vw, 700px\" type=\"image/png\">\n        <img class=\"gatsby-resp-image-image\" src=\"/static/060f8f204b833689b1569a4162d67e3d/69344/the-world-is-changing.png\" alt=\"the world is changing\" title=\"the world is changing\" loading=\"lazy\" style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\">\n      </picture>\n  </a>\n    </span></body></html></p>\n<p>While we were busy finalizing all the PyCon 2019 prep, our own\n<a href=\"https://twitter.com/FullStackML\">Dmitry Petrov</a> flew to New York to speak at\nthe\n<a href=\"https://conferences.oreilly.com/artificial-intelligence/ai-ny\">O’Reilly AI Conference</a>\nabout the\n<a href=\"https://www.oreilly.com/library/view/artificial-intelligence-conference/9781492050544/video324691.html\">Open Source tools for Machine Learning Models and Datasets versioning</a>.\nUnfortunately the video is available for the registered users only (with a free\ntrial option) but you can have a look at Dmitry’s slides\n<a 
href=\"https://www.slideshare.net/DmitryPetrov15/dvc-oreilly-artificial-intelligence-conference-2019-new-york\">here</a>.</p>\n<p><html><head></head><body><span class=\"gatsby-resp-image-wrapper\" style=\"position: relative; display: block; margin-left: auto; margin-right: auto;  max-width: 404px;\">\n      <a class=\"gatsby-resp-image-link\" href=\"/static/bee9b4ed9981db1bf7eb9db8450fc8d1/38b39/iterative-ai-twitter.png\" style=\"display: block\" target=\"_blank\" rel=\"noopener\">\n    <span class=\"gatsby-resp-image-background-image\" style=\"padding-bottom: 25.247524752475247%; position: relative; bottom: 0; left: 0; background-image: url(&#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAFCAIAAADKYVtkAAAACXBIWXMAAAsSAAALEgHS3X78AAAA2klEQVQY02VQ226DMAzl/z9p0l76tKcxKYxEo0xlpWkLTQm5X0iYNdaHaUeWjmX52McujI/BO29t237WGDdNczz215FdhtvAJgipDZtmPsuU8voXxW5Pd0/Puj+gmpTlW1W9v5YlPV+Usc4aQIzROr+ktP5DceXygzROiRoTTAgsRwh9dZ1UpuOZijSY9URvpKVVfWCzBU1+OCg2yjnf75OUSgghgbSajBh1nO3Cw8rHkVWY01NwHprB/2bkVwxqbWx+zIwp9ursktsqi1DiBSVl88/ZPgR4ASTfmIMa4VKNNzMAAAAASUVORK5CYII=&#x27;); background-size: cover; display: block;\"></span>\n  <picture>\n        <source srcset=\"/static/bee9b4ed9981db1bf7eb9db8450fc8d1/c54d4/iterative-ai-twitter.webp 175w, /static/bee9b4ed9981db1bf7eb9db8450fc8d1/a3432/iterative-ai-twitter.webp 350w, /static/bee9b4ed9981db1bf7eb9db8450fc8d1/426ac/iterative-ai-twitter.webp 700w, /static/bee9b4ed9981db1bf7eb9db8450fc8d1/2b269/iterative-ai-twitter.webp 808w\" sizes=\"(max-width: 700px) 100vw, 700px\" type=\"image/webp\">\n        <source srcset=\"/static/bee9b4ed9981db1bf7eb9db8450fc8d1/17006/iterative-ai-twitter.png 175w, /static/bee9b4ed9981db1bf7eb9db8450fc8d1/d6f3f/iterative-ai-twitter.png 350w, /static/bee9b4ed9981db1bf7eb9db8450fc8d1/69344/iterative-ai-twitter.png 700w, /static/bee9b4ed9981db1bf7eb9db8450fc8d1/38b39/iterative-ai-twitter.png 808w\" sizes=\"(max-width: 700px) 100vw, 700px\" type=\"image/png\">\n        <img 
class=\"gatsby-resp-image-image\" src=\"/static/bee9b4ed9981db1bf7eb9db8450fc8d1/69344/iterative-ai-twitter.png\" alt=\"iterative ai twitter\" title=\"iterative ai twitter\" loading=\"lazy\" style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\">\n      </picture>\n  </a>\n    </span></body></html></p>\n<p>We renamed our Twitter! Our old handle was a bit misleading and we moved from\n@Iterativeai to <a href=\"https://twitter.com/DVCorg\">@DVCorg</a> (yet keep the old one for\nfuture projects).</p>\n<p>Our team is so happy every time we discover an article featuring DVC or\naddressing one of the burning ML issues we are trying to solve. Here are some of\nour favorite links from the past month:</p>\n<ul>\n<li><strong><a href=\"https://www.pythonpodcast.com/data-version-control-episode-206/\">Version Control For Your Machine Learning Projects — Episode 206</a></strong>\nby <strong><a href=\"https://www.linkedin.com/in/tmacey/\">Tobias Macey</a></strong></li>\n</ul>\n<p><html><head></head><body><html><head></head><body><section class=\"elp-content-holder\">\n      <a href=\"https://www.pythonpodcast.com/data-version-control-episode-206/\" class=\"external-link-preview\">\n          <div class=\"elp-description-holder\">\n            <h4 class=\"elp-title\">Version Control For Machine Learning Projects</h4>\n            <div class=\"elp-description\">An interview with the creator of DVC about how it improves collaboration and reduces duplicate effort on data science…</div>\n            <div class=\"elp-link\">pythonpodcast.com</div>\n          </div>\n           <div class=\"elp-image-holder\">\n                <img src=\"/uploads/images/2019-05-21/version-control-for-your-machine-learning-projects.png\" alt=\"Version Control For Machine Learning Projects\">\n            </div>\n      </a>\n    </section>\n    </body></html></body></html></p>\n<blockquote>\n<p>Version control has become table stakes for any software team, but for 
machine\nlearning projects there has been no good answer for tracking all of the data\nthat goes into building and training models, and the output of the models\nthemselves. To address that need Dmitry Petrov built the Data Version Control\nproject known as DVC. In this episode he explains how it simplifies\ncommunication between data scientists, reduces duplicated effort, and\nsimplifies concerns around reproducing and rebuilding models at different\nstages of the projects lifecycle.</p>\n</blockquote>\n<ul>\n<li><strong>Here is an\n<a href=\"https://towardsdatascience.com/data-version-control-with-dvc-what-do-the-authors-have-to-say-3c3b10f27ee\">article</a>\nby <a href=\"https://medium.com/@faviovazquez\">Favio Vázquez</a> with a transcript of this\npodcast episode.</strong></li>\n</ul>\n<p><html><head></head><body><html><head></head><body><section class=\"elp-content-holder\">\n      <a href=\"https://towardsdatascience.com/data-version-control-with-dvc-what-do-the-authors-have-to-say-3c3b10f27ee\" class=\"external-link-preview\">\n          <div class=\"elp-description-holder\">\n            <h4 class=\"elp-title\">Data version control with DVC. What do the authors have to say?</h4>\n            <div class=\"elp-description\">Data versioning is one of the most ignored features in data science projects, but that has to change. Here I’ll discuss…</div>\n            <div class=\"elp-link\">towardsdatascience.com</div>\n          </div>\n           <div class=\"elp-image-holder\">\n                <img src=\"/uploads/images/2019-05-21/data-version-control-with-dvc.png\" alt=\"Data version control with DVC. 
What do the authors have to say?\">\n            </div>\n      </a>\n    </section>\n    </body></html></body></html></p>\n<ul>\n<li><strong><a href=\"https://towardsdatascience.com/why-git-and-git-lfs-is-not-enough-to-solve-the-machine-learning-reproducibility-crisis-f733b49e96e8\">Why Git and Git-LFS is not enough to solve the Machine Learning Reproducibility crisis</a></strong></li>\n</ul>\n<p><html><head></head><body><html><head></head><body><section class=\"elp-content-holder\">\n      <a href=\"https://towardsdatascience.com/why-git-and-git-lfs-is-not-enough-to-solve-the-machine-learning-reproducibility-crisis-f733b49e96e8\" class=\"external-link-preview\">\n          <div class=\"elp-description-holder\">\n            <h4 class=\"elp-title\">Why Git and Git-LFS is not enough to solve the Machine Learning Reproducibility crisis</h4>\n            <div class=\"elp-description\">Some claim the machine learning field is in a crisis due to software tooling that’s insufficient to ensure repeatable…</div>\n            <div class=\"elp-link\">towardsdatascience.com</div>\n          </div>\n           <div class=\"elp-image-holder\">\n                <img src=\"/uploads/images/2019-05-21/why-git-and-git-lfs-is-not-enough.jpeg\" alt=\"Why Git and Git-LFS is not enough to solve the Machine Learning Reproducibility crisis\">\n            </div>\n      </a>\n    </section>\n    </body></html></body></html></p>\n<blockquote>\n<p>With Git-LFS your team has better control over the data, because it is now\nversion controlled. Does that mean the problem is solved? Earlier we said the\n“<em>key issue is the training data</em>”, but that was a lie. Sort of. Yes keeping\nthe data under version control is a big improvement. But is the lack of\nversion control of the data files the entire problem? No.</p>\n</blockquote>\n<html><head></head><body><hr></body></html>\n<h2>Discord gems</h2>\n<p>There are lots of hidden gems in our Discord community discussions. 
Sometimes\nthey are scattered all over the channels and hard to track down.</p>\n<p>We are sifting through the issues and discussions and share with you the most\ninteresting takeaways.</p>\n<h3>Q: This might be <a href=\"https://discordapp.com/channels/485586884165107732/485598848111083531/572960640122224640\">a favourite gem of ours </a> — our engineers are so fast that someone assumed they were bots.</h3>\n<p>We feared that too until we met them in person. They appeared to be real (unless\nbots also love Ramen now)!</p>\n<p><html><head></head><body><span class=\"gatsby-resp-image-wrapper\" style=\"position: relative; display: block; margin-left: auto; margin-right: auto;  max-width: 700px;\">\n      <a class=\"gatsby-resp-image-link\" href=\"/static/4926411413e184b4531924e6c0aeaf02/e0305/bots-also-love-ramen-now.png\" style=\"display: block\" target=\"_blank\" rel=\"noopener\">\n    <span class=\"gatsby-resp-image-background-image\" style=\"padding-bottom: 76.90387016229712%; position: relative; bottom: 0; left: 0; background-image: url(&#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAPCAIAAABr+ngCAAAACXBIWXMAAAsSAAALEgHS3X78AAABfklEQVQoz5WTXUvDMBSG+3f0Qpxd06T5PEmapp3adptzMAYOvVLEW3++pymIV7LBQzmkvJwnp6fZ9/vudNhw3VQCjLaUa8LUmWRfr+NxN8R2jPf9Nq6ktAU9O3y9qO4KGaC1dVwQWTB5QedYe7Dgg+HKMwHnJ6cwIbxkyoeoXU+5JRd1/v54fjlsqKxppcpKX9b58zQc98PjuH3o15TDZZ2vbtkdEc4HZSPOuaDyl3MG5gCsdSCUFdpV0s7888HL6YIpLKXSpl5v9+PQSwjaNohxDRMmJwL7L5PFshTIZFQKNF2mXcjwHRM+dE84cOOitgF8TPlo6864Fnyb6nbCt8q2+3UbashLmS2KilAN5rHkgBs6kbRZKlg6wSfDw1RTAUpjoXFA2TBuQuhowZIbR/JkmOAJ8Yf5CnJe4ez09r57fglm1cQH26whDKiH5qgq7ajdPdZorqCJMW56lxM5T2sK3+RVUUHdYaYDv0KSsMEn5Q53js3a0yEIqf/+Nj/OONkeIe6EBgAAAABJRU5ErkJggg==&#x27;); background-size: cover; display: block;\"></span>\n  <picture>\n        <source srcset=\"/static/4926411413e184b4531924e6c0aeaf02/c54d4/bots-also-love-ramen-now.webp 175w, /static/4926411413e184b4531924e6c0aeaf02/a3432/bots-also-love-ramen-now.webp 350w, 
/static/4926411413e184b4531924e6c0aeaf02/426ac/bots-also-love-ramen-now.webp 700w, /static/4926411413e184b4531924e6c0aeaf02/c139f/bots-also-love-ramen-now.webp 1050w, /static/4926411413e184b4531924e6c0aeaf02/7f403/bots-also-love-ramen-now.webp 1400w, /static/4926411413e184b4531924e6c0aeaf02/e2173/bots-also-love-ramen-now.webp 1602w\" sizes=\"(max-width: 700px) 100vw, 700px\" type=\"image/webp\">\n        <source srcset=\"/static/4926411413e184b4531924e6c0aeaf02/17006/bots-also-love-ramen-now.png 175w, /static/4926411413e184b4531924e6c0aeaf02/d6f3f/bots-also-love-ramen-now.png 350w, /static/4926411413e184b4531924e6c0aeaf02/69344/bots-also-love-ramen-now.png 700w, /static/4926411413e184b4531924e6c0aeaf02/b1f9d/bots-also-love-ramen-now.png 1050w, /static/4926411413e184b4531924e6c0aeaf02/3fc71/bots-also-love-ramen-now.png 1400w, /static/4926411413e184b4531924e6c0aeaf02/e0305/bots-also-love-ramen-now.png 1602w\" sizes=\"(max-width: 700px) 100vw, 700px\" type=\"image/png\">\n        <img class=\"gatsby-resp-image-image\" src=\"/static/4926411413e184b4531924e6c0aeaf02/69344/bots-also-love-ramen-now.png\" alt=\"bots also love ramen now\" title=\"bots also love ramen now\" loading=\"lazy\" style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\">\n      </picture>\n  </a>\n    </span></body></html></p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/572974117351849997\">Is this the best way to track data with DVC when code and data are separate?</a> Having been burned by this a couple of times (i.e., accidentally pushing large files to GitHub), I now keep my code and data separate.</h3>\n<p>Every time you run <html><head></head><body><code class=\"language-text\">dvc add</code></body></html> to start tracking some data artifact, its path is\nautomatically added to the <html><head></head><body><code class=\"language-text\">.gitignore</code></body></html> file. As a result, it is hard to commit\nit to 
git by mistake: you would need to explicitly modify the <html><head></head><body><code class=\"language-text\">.gitignore</code></body></html>\nfirst. The feature for tracking external data is called\n<a href=\"https://dvc.org/doc/user-guide/external-outputs\">external outputs</a> (if all you\nneed is to track some data artifacts). It is usually used when you have some\ndata on S3 or SSH and don’t want to pull it into your working space, but it\nalso works when your data is located on the same machine outside of the\nrepository.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/571342592508428289\">How do I wrap a step that downloads a file/directory into a DVC stage?</a> I want to ensure that it runs only if the file has not been downloaded yet</h3>\n<p>Use <html><head></head><body><code class=\"language-text\">dvc import</code></body></html> to track and download the remote data the first time, and again on the next\ndvc repro if the data has changed remotely. If you don’t want to track\nremote changes (lock the data after it was downloaded), use <html><head></head><body><code class=\"language-text\">dvc run</code></body></html> with a\ndummy dependency (any text file that you do not touch will do) that runs an actual\nwget/curl to get the data.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/570943786151313408\">How do I show a pipeline that does not have a default Dvcfile?</a> (e.g. 
I assigned all file names manually with <html><head></head><body><code class=\"language-text\">-f</code></body></html> in the <html><head></head><body><code class=\"language-text\">dvc run</code></body></html> command and I just don’t have <html><head></head><body><code class=\"language-text\">Dvcfile</code></body></html> anymore)</h3>\n<p>Almost any command in DVC that deals with pipelines (sets of DVC-files) accepts a\nsingle stage as a target, for example:</p>\n<html><head></head><body><div class=\"gatsby-highlight\" data-language=\"dvc\"><pre class=\"language-dvc\"><code class=\"language-dvc\"><span class=\"token line\"><span class=\"token input\">$ </span><span class=\"token dvc\">dvc pipeline show</span> --ascii model.dvc</span></code></pre></div></body></html>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/570843482218823682\">DVC hangs or I’m getting <html><head></head><body><code class=\"language-text\">database is locked</code></body></html> issue</a></h3>\n<p>It’s a well-known problem with NFS and CIFS (Azure): they do not properly support the file\nlocks that the SQLite engine requires to operate. The easiest\nworkaround is not to create a DVC project on a network-attached partition. In\ncertain cases, the problem can be fixed by changing the mount options; check\n<a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/570276668694855690\">this discussion</a>\nfor the Azure ML Service.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/570091809594671126\">How do I use DVC if I use a separate drive to store the data and a small/fast SSD to run computations?</a> I don’t have enough space to bring data to my working space.</h3>\n<p>An excellent question! 
The short answer is:</p>\n<html><head></head><body><div class=\"gatsby-highlight\" data-language=\"dvc\"><pre class=\"language-dvc\"><code class=\"language-dvc\"><span class=\"token comment\"># To move your data cache to a big partition</span>\n<span class=\"token line\"><span class=\"token input\">$ </span><span class=\"token dvc\">dvc cache dir</span> --local /path/to/an/external/partition\n</span>\n<span class=\"token comment\"># To enable symlinks/hardlinks to avoid actual copying</span>\n<span class=\"token line\"><span class=\"token input\">$ </span><span class=\"token dvc\">dvc config</span> cache.type reflink,hardlink,symlink,copy\n</span>\n<span class=\"token comment\"># To protect the cache</span>\n<span class=\"token line\"><span class=\"token input\">$ </span><span class=\"token dvc\">dvc config</span> cache.protected <span class=\"token boolean\">true</span></span></code></pre></div></body></html>\n<p>The last one is highly recommended: it makes links in your working space read-only\nto avoid corrupting the cache. Read more about different link types\n<a href=\"https://dvc.org/doc/user-guide/large-dataset-optimization\">here</a>.</p>\n<p>To add your data to the DVC cache for the first time, clone the repository on a\nbig partition and run <html><head></head><body><code class=\"language-text\">dvc add</code></body></html> to add your data. 
Then you can run <html><head></head><body><code class=\"language-text\">git pull</code></body></html> and\n<html><head></head><body><code class=\"language-text\">dvc pull</code></body></html> on a small partition, and DVC will create all the necessary links.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/571335064374345749\">Why am I getting <html><head></head><body><code class=\"language-text\">Paths for outs overlap</code></body></html> error when I run <html><head></head><body><code class=\"language-text\">dvc add</code></body></html> or <html><head></head><body><code class=\"language-text\">dvc run</code></body></html>?</a></h3>\n<p>Usually it means that a parent directory of one of the arguments for <html><head></head><body><code class=\"language-text\">dvc add</code></body></html> /\n<html><head></head><body><code class=\"language-text\">dvc run</code></body></html> is already tracked. For example, you’ve already added the whole datasets\ndirectory, and now you are trying to add a subdirectory that is\nalready tracked as a part of the datasets one. No need to do that. You could\n<html><head></head><body><code class=\"language-text\">dvc add datasets</code></body></html> or <html><head></head><body><code class=\"language-text\">dvc repro datasets.dvc</code></body></html> to save changes.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/567310354766495747\">I’m getting <html><head></head><body><code class=\"language-text\">ascii codec can’t encode character</code></body></html> error on DVC commands when I deal with unicode file names</a></h3>\n<p><a href=\"https://perlgeek.de/en/article/set-up-a-clean-utf8-environment\">Check the locale settings you have</a>\n(<html><head></head><body><code class=\"language-text\">locale</code></body></html> command in Linux). Python expects a locale that can handle unicode\nprinting. 
Usually it’s solved with these commands: <html><head></head><body><code class=\"language-text\">export LC_ALL=en_US.UTF-8</code></body></html>\nand <html><head></head><body><code class=\"language-text\">export LANG=en_US.UTF-8</code></body></html>. You can place those exports into <html><head></head><body><code class=\"language-text\">.bashrc</code></body></html> or\nanother file that defines your environment.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/563149775340568576\">Does DVC use the same logins <html><head></head><body><code class=\"language-text\">aws-cli</code></body></html> has when using an S3 bucket as its repo/remote storage</a>?</h3>\n<p>In short: yes, but this can also be configured. DVC is going to use either your\ndefault profile (from <html><head></head><body><code class=\"language-text\">~/.aws/*</code></body></html>) or your env vars by default. If you need more\nflexibility (e.g. you need to use different credentials for different projects,\netc.), check out\n<a href=\"https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html\">this guide</a>\nto configure custom AWS profiles, which you can then use with DVC via\nthese\n<a href=\"https://dvc.org/doc/commands-reference/remote-add#options\">remote options</a>.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/566000729505136661\">How can I output multiple metrics from a single file?</a></h3>\n<p>Let’s say I have the following in a file:</p>\n<html><head></head><body><div class=\"gatsby-highlight\" data-language=\"json\"><pre class=\"language-json\"><code class=\"language-json\"><span class=\"token punctuation\">{</span>\n  \"AUC_RATIO\"<span class=\"token operator\">:</span>\n    <span class=\"token punctuation\">{</span>\n      \"train\"<span class=\"token operator\">:</span> <span class=\"token number\">0.8922748258797667</span><span class=\"token punctuation\">,</span>\n      \"valid\"<span 
class=\"token operator\">:</span> <span class=\"token number\">0.8561602726251776</span><span class=\"token punctuation\">,</span>\n      \"xval\"<span class=\"token operator\">:</span> <span class=\"token number\">0.8843431199314923</span>\n    <span class=\"token punctuation\">}</span>\n<span class=\"token punctuation\">}</span></code></pre></div></body></html>\n<p>How can I show both <html><head></head><body><code class=\"language-text\">train</code></body></html> and <html><head></head><body><code class=\"language-text\">valid</code></body></html> without <html><head></head><body><code class=\"language-text\">xval</code></body></html>?</p>\n<p>You can use the <html><head></head><body><code class=\"language-text\">--xpath</code></body></html> option of the <html><head></head><body><code class=\"language-text\">dvc metrics show</code></body></html> command and provide multiple\nattribute names to it:</p>\n<html><head></head><body><div class=\"gatsby-highlight\" data-language=\"dvc\"><pre class=\"language-dvc\"><code class=\"language-dvc\"><span class=\"token line\"><span class=\"token input\">$ </span><span class=\"token dvc\">dvc metrics show</span> metrics.json <span class=\"token punctuation\">\\</span>\n                  --type json <span class=\"token punctuation\">\\</span>\n                  --xpath AUC_RATIO<span class=\"token punctuation\">[</span>train,valid<span class=\"token punctuation\">]</span>\n</span>    metrics.json:\n                 0.89227482588\n                 0.856160272625</code></pre></div></body></html>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/566314479499870211\">What is the quickest way to add a new dependency to a DVC-file?</a></h3>\n<p>There are a few options to add a new dependency:</p>\n<ul>\n<li>simply opening a file with your favorite editor and adding a dependency there\nwithout md5. 
DVC will detect that the stage has changed, then re-run it\nand recalculate the md5 checksums during the next dvc repro;</li>\n<li>using <html><head></head><body><code class=\"language-text\">dvc run --no-exec</code></body></html> is another option. It will rewrite the existing file\nfor you with the new parameters.</li>\n</ul>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/566315265646788628\">Is there a way to add a dependency on a Python package, so that a stage runs again if it imports an updated library?</a></h3>\n<p>The only recommended way so far is to let DVC know about your\npackage’s version. One way to do that is to create a separate stage that\ndynamically prints the version of that specific package into a file that\nyour stage depends on:</p>\n<html><head></head><body><div class=\"gatsby-highlight\" data-language=\"dvc\"><pre class=\"language-dvc\"><code class=\"language-dvc\"><span class=\"token line\"><span class=\"token input\">$ </span><span class=\"token dvc\">dvc run</span> -o mypkgver 'pip show mypkg <span class=\"token operator\">></span> mypkgver'\n</span><span class=\"token line\"><span class=\"token input\">$ </span><span class=\"token dvc\">dvc run</span> -d mypkgver -d <span class=\"token punctuation\">..</span>. -o <span class=\"token punctuation\">..</span> mycmd</span></code></pre></div></body></html>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/564807276146458624\">Is there any way to forcibly recompute the hashes of dependencies in a pipeline DVC-file?</a></h3>\n<p>E.g. I made some whitespace/comment changes in my code and I want to tell DVC\n“it’s ok, you don’t have to recompute everything”.</p>\n<p>Yes, you could run <html><head></head><body><code class=\"language-text\">dvc commit -f</code></body></html>. 
It will save all current checksums without\nre-running your commands.</p>\n<h3>Q: <a href=\"https://discordapp.com/channels/485586884165107732/485596304961962003/563352000281182218\">I have projects that use data that’s stored in S3. I never have data locally to use <html><head></head><body><code class=\"language-text\">dvc push</code></body></html>, but I would like to have this data version controlled.</a> Is there a way to use the features of DVC in this use case?</h3>\n<p>Yes! These DVC features are called\n<a href=\"https://dvc.org/doc/user-guide/external-outputs\">external outputs</a> and\n<a href=\"https://dvc.org/doc/user-guide/external-dependencies\">external dependencies</a>.\nYou can use one of them or both to track, process, and version your data on\ncloud storage without downloading it locally.</p>\n<html><head></head><body><hr></body></html>\n<p>If you have any questions, concerns or ideas, let us know\n<a href=\"https://dvc.org/support\">here</a> and our stellar team will get back to you in no\ntime!</p>","timeToRead":11,"fields":{"slug":"/may-19-dvc-heartbeat"},"frontmatter":{"title":"May ’19 DVC❤️Heartbeat","date":"May 21, 2019","description":"DVC accepted into Google Season of Docs 🎉, Dmitry's talk at the O’Reilly AI\nConference, new portion of Discord gems, and articles either created or\nbrought to us by our community.\n","descriptionLong":"Every month we are sharing here our news, findings, interesting reads,\ncommunity takeaways, and everything along the way.\nSome of those are related to our brainchild DVC and its journey. 
The others\nare a collection of exciting stories and ideas centered around ML best\npractices and workflow.\n","tags":["Heartbeat","Discord Gems","DVC","Google Season of Docs"],"commentsUrl":"https://discuss.dvc.org/t/may-19-dvc-heartbeat/290","author":{"childMarkdownRemark":{"frontmatter":{"name":"Svetlana Grinchenko","avatar":{"childImageSharp":{"fixed":{"base64":"data:image/jpeg;base64,/9j/2wBDABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVGC8aGi9jQjhCY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2P/wgARCAAUABQDASIAAhEBAxEB/8QAGAABAAMBAAAAAAAAAAAAAAAAAAIDBQT/xAAUAQEAAAAAAAAAAAAAAAAAAAAA/9oADAMBAAIQAxAAAAHRz7x3qhnQCoH/xAAaEAACAwEBAAAAAAAAAAAAAAABAgMEMxIU/9oACAEBAAEFArBJZ0aqyN0s+trCvhYchTKzj0SRD//EABQRAQAAAAAAAAAAAAAAAAAAACD/2gAIAQMBAT8BH//EABQRAQAAAAAAAAAAAAAAAAAAACD/2gAIAQIBAT8BH//EAB0QAAICAgMBAAAAAAAAAAAAAAECABExQQMSIVH/2gAIAQEABj8CXiVuvbcDK9jYMDDcRyLBFTHpIixRflQjAHyBVwJ//8QAGxABAAMAAwEAAAAAAAAAAAAAAQARQSExUXH/2gAIAQEAAT8ht1LLoIkF4B6wLgOVH8uRVjiTT5AoqBr2L1odagwijiyf/9oADAMBAAIAAwAAABBjDwD/xAAUEQEAAAAAAAAAAAAAAAAAAAAg/9oACAEDAQE/EB//xAAUEQEAAAAAAAAAAAAAAAAAAAAg/9oACAECAQE/EB//xAAeEAACAwADAAMAAAAAAAAAAAABEQAhMUFRYXGBkf/aAAgBAQABPxCmAzsDr5hIjqwzzzqjPahi5r8IQVrj2MCG47GtJ+o5KCJ7t+zbocIBE8rYP/mqRYTv5EF8QZIHU//Z","width":40,"height":40,"src":"/static/fcc8502faa36f9a989fa0651c3c21653/d83e5/svetlana_grinchenko.jpg","srcSet":"/static/fcc8502faa36f9a989fa0651c3c21653/d83e5/svetlana_grinchenko.jpg 1x,\n/static/fcc8502faa36f9a989fa0651c3c21653/58860/svetlana_grinchenko.jpg 1.5x,\n/static/fcc8502faa36f9a989fa0651c3c21653/90ac5/svetlana_grinchenko.jpg 2x","srcWebp":"/static/fcc8502faa36f9a989fa0651c3c21653/e145b/svetlana_grinchenko.webp","srcSetWebp":"/static/fcc8502faa36f9a989fa0651c3c21653/e145b/svetlana_grinchenko.webp 1x,\n/static/fcc8502faa36f9a989fa0651c3c21653/0d42c/svetlana_grinchenko.webp 1.5x,\n/static/fcc8502faa36f9a989fa0651c3c21653/f46db/svetlana_grinchenko.webp 
2x"}}}}}},"picture":{"childImageSharp":{"fluid":{"base64":"data:image/jpeg;base64,/9j/2wBDABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVGC8aGi9jQjhCY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2P/wgARCAAQABQDASIAAhEBAxEB/8QAGAAAAgMAAAAAAAAAAAAAAAAAAAMBAgT/xAAVAQEBAAAAAAAAAAAAAAAAAAABA//aAAwDAQACEAMQAAABZZUxrrFA/wD/xAAYEAEBAQEBAAAAAAAAAAAAAAABABESQf/aAAgBAQABBQLoLoYcPcLb/8QAFhEAAwAAAAAAAAAAAAAAAAAAARAR/9oACAEDAQE/ATV//8QAFhEAAwAAAAAAAAAAAAAAAAAAARAR/9oACAECAQE/ARF//8QAGxAAAgIDAQAAAAAAAAAAAAAAAAExMhIhIqH/2gAIAQEABj8CIZqBZdJFPSp//8QAHRABAAICAgMAAAAAAAAAAAAAAQAhEVFBYYGh0f/aAAgBAQABPyERZTqHjEbA+kwtgVEN/j6gwDL1P//aAAwDAQACAAMAAAAQ6x//xAAXEQEAAwAAAAAAAAAAAAAAAAAAARGB/9oACAEDAQE/EIXY/8QAFhEBAQEAAAAAAAAAAAAAAAAAACGB/9oACAECAQE/EItf/8QAHhABAAICAgMBAAAAAAAAAAAAAQAhEUExUWGBwdH/2gAIAQEAAT8QBKJKOj5g+Eeid8wERGxqNtwKRKq9+47HDzNsfUGhoXfb8xP/2Q==","aspectRatio":1.250103348491112,"src":"/static/6f52463e2d2faba533fb4ff7e0501b49/6fdf8/post-image.jpg","srcSet":"/static/6f52463e2d2faba533fb4ff7e0501b49/9fc73/post-image.jpg 213w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/ee221/post-image.jpg 425w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/6fdf8/post-image.jpg 850w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/88a70/post-image.jpg 1275w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/15ae8/post-image.jpg 1700w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/4b3e1/post-image.jpg 3024w","srcWebp":"/static/6f52463e2d2faba533fb4ff7e0501b49/5c1d9/post-image.webp","srcSetWebp":"/static/6f52463e2d2faba533fb4ff7e0501b49/99b2d/post-image.webp 213w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/23220/post-image.webp 425w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/5c1d9/post-image.webp 850w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/5e720/post-image.webp 1275w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/35cfd/post-image.webp 1700w,\n/static/6f52463e2d2faba533fb4ff7e0501b49/45987/post-image.webp 3024w","sizes":"(max-width: 850px) 100vw, 
850px","presentationWidth":850}}},"pictureComment":"Kudos to StickerMule.com for our amazing stickers (and great customer service)!"}}},"pageContext":{"next":{"fields":{"slug":"/june-19-dvc-heartbeat"},"frontmatter":{"title":"June ’19 DVC❤️Heartbeat"}},"previous":{"fields":{"slug":"/dvc-project-ideas-for-google-summer-of-docs-2019"},"frontmatter":{"title":"DVC project ideas for Google Season of Docs 2019"}},"currentPage":11,"slug":"/may-19-dvc-heartbeat"}}}