{"id":3723,"date":"2019-12-29T04:09:13","date_gmt":"2019-12-29T04:09:13","guid":{"rendered":"http:\/\/nayrb.org\/~blog\/?p=3723"},"modified":"2019-12-29T04:09:13","modified_gmt":"2019-12-29T04:09:13","slug":"the-mysterious-case-of-the-regex-dot","status":"publish","type":"post","link":"http:\/\/nayrb.org\/~blog\/2019\/12\/29\/the-mysterious-case-of-the-regex-dot\/","title":{"rendered":"The Mysterious Case of the Regex Dot"},"content":{"rendered":"<p>So, I&#8217;m in the middle of organizing my photos into folders, something more useable than the default Photos application on Mac[1].<\/p>\n<p>While trying to count the number of photos\/videos[2] in each subdirectory in my &#8230;\/2018\/ folder:<\/p>\n<p>$ time find * |grep IMG|grep -o &#8216;^[0-9][0-9]\/.&#8217;|uniq -c<br \/>\n  22 04\/0<br \/>\n3297 05\/1<br \/>\n 104 05\/2<br \/>\n 100 06\/0<br \/>\n1830 06\/2<br \/>\n2040 10\/2<\/p>\n<p>I first tried the supposedly logical:<\/p>\n<p>$ time find * |grep IMG|grep -o ^..|uniq -c|head<br \/>\n   1 04<br \/>\n   1 \/0<br \/>\n   1 1\/<br \/>\n   1 20<br \/>\n   1 18<br \/>\n   1 04<br \/>\n   1 01<br \/>\n   1 -0<br \/>\n   1 00<br \/>\n   1 41<\/p>\n<p>Interestingly, grep (and\/or the OS) seemed to be taking the front off of each line, and then putting it back into the STDIN hopper for the next call to grep.<\/p>\n<p>As this was not doing what I expected (nor wanted), I tried:<\/p>\n<p>$ time find * |grep IMG|grep -o &#8216;^[0-9][0-9]\/&#8217;|uniq -c|head<br \/>\n   1 04\/<br \/>\n   1 01\/<br \/>\n   1 04\/<br \/>\n   1 01\/<br \/>\n   1 04\/<br \/>\n   1 01\/<br \/>\n   1 04\/<br \/>\n   1 01\/<br \/>\n   1 04\/<br \/>\n   1 01\/<\/p>\n<p>Which, while better&#8230;<\/p>\n<p>$ time find * |grep IMG|grep -o &#8216;^[0-9][0-9]\/&#8217;|uniq -c|sort|uniq -c<br \/>\n  22    1 01\/<br \/>\n  22    1 04\/<br \/>\n3501    1 05\/<br \/>\n1930    1 06\/<br \/>\n2040    1 10\/<br \/>\n3297    1 13\/<br \/>\n 104    1 23\/<br \/>\n1830    1 24\/<br \/>\n2040    1 
27\/<\/p>\n<p>&#8230;gave me too many results by about a factor of two, and somehow found 27 months in the year.<\/p>\n<p>I quickly figured out that while parsing mm\/dd\/yyyymmdd-hash\/IMG_[0-9][0-9][0-9][0-9].[FILETYPE], this particular grep\/OS combination will happily grab the &#8216;mm\/&#8217;, and then also grab the &#8216;dd\/&#8217;. This habit, while charming, does not solve my problem.<\/p>\n<p>After a Google search (https:\/\/www.google.com\/search?q=grep+one+match+per+line) proved unfruitful, I decided to try:<\/p>\n<p>$ time find * |grep IMG|grep -o &#8216;^[0-9][0-9]\/.&#8217;|uniq -c<br \/>\n  22 04\/0<br \/>\n3297 05\/1<br \/>\n 104 05\/2<br \/>\n 100 06\/0<br \/>\n1830 06\/2<br \/>\n2040 10\/2<\/p>\n<p>and it worked!<\/p>\n<p>I was stumped, until I figured out that the issues I had been seeing before were entirely because, after each match, grep was treating the start of the newly chomped string as a fresh start of line for &#8216;^&#8217;, and that by having the trailing &#8216;.&#8217; chomp the first character of the next &#8216;match&#8217;, I was stopping grep from finding any more matches on that line.<\/p>\n<p>#themoreyouknow<\/p>\n<p>[1] Right now, when Photos organizes photos, it puts each photo into its own folder, based on year\/month\/day\/yyyymmdd-hash, which makes it super-annoying to use anything but the Photos app, which is super-slow and annoying to use.<\/p>\n<p>[2] The images are all in the format &#8216;IMG_[0-9][0-9][0-9][0-9].[FILETYPE]&#8217;, where FILETYPE can be &#8216;PNG&#8217; (screenshots), &#8216;JPG&#8217; (camera pictures), &#8216;MOV&#8217; (camera movies), &#8216;GIF&#8217; (saved .gifs), or perhaps some other recognized image format.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So, I&#8217;m in the middle of organizing my photos into folders, something more useable than the default Photos application on Mac[1]. 
While trying to count the number of photos\/videos[2] in each subdirectory in my &#8230;\/2018\/ folder: $ time find * |grep IMG|grep -o &#8216;^[0-9][0-9]\/.&#8217;|uniq -c 22 04\/0 3297 05\/1 104 05\/2 100 06\/0 1830 06\/2 &hellip; <a href=\"http:\/\/nayrb.org\/~blog\/2019\/12\/29\/the-mysterious-case-of-the-regex-dot\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">The Mysterious Case of the Regex Dot<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[30],"tags":[],"_links":{"self":[{"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/posts\/3723"}],"collection":[{"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/comments?post=3723"}],"version-history":[{"count":2,"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/posts\/3723\/revisions"}],"predecessor-version":[{"id":3725,"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/posts\/3723\/revisions\/3725"}],"wp:attachment":[{"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/media?parent=3723"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/categories?post=3723"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/nayrb.org\/~blog\/wp-json\/wp\/v2\/tags?post=3723"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}