Fun with regular expressions in sed and grep

Since starting this project 10 episodes ago, we have yet to really leave the command line interface. You’ve learned a lot of bash syntax - mkdir, cd, ls, pwd, touch, rm, rmdir, nano, git, make, sed, |, if, mv, and probably a few others. If you have all those down, then you’re in great shape and are likely seeing the value of using these bash commands to automate a reproducible workflow. But if these still seem a bit challenging to you, don’t fret! We’re going to spend a few more episodes in bash to strengthen our familiarity with these commands and my general workflow.

In today’s episode, we’ll see many of those commands and some new ones to help solve a problem I found in our analysis. In the last two episodes, we used special patterns called “regular expressions” with sed to extract information from our file names and paths. If you did the exercises in the sed episode, I showed how you can run sed on the contents of a file rather than its name. But sed isn’t the only place that we can use regular expressions in bash. There’s another, probably more popular, tool called grep where we can use regular expressions. Heck, the name grep is short for “globally search for a regular expression and print matching lines”.

After the last episode, I was looking back through our files and noticed that mothur had changed our sequence names because the names had spaces in them. Have I mentioned how horrible spaces are for bioinformatics work?!

I also noticed that although most of our sequences start and end at the coordinates that we trimmed them to, there are a few for each region that don’t. In those cases, mothur starts the sequence with a series of periods to indicate missing data. Later on, we might decide to toss those sequences because they’re weird. For now, I’d prefer to have those periods be hyphens to represent gap characters. Instead of opening these files in a text editor and replacing all the spaces with underscores or replacing the periods with hyphens, we can fix the information using sed. Along the way we’ll learn a few extra commands to keep things interesting.
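To make the idea concrete, here is a minimal sketch of the kind of sed and grep commands this fix calls for. It is not the exact code from the episode: the toy FASTA file, its sequence names, and the variable names (`tmp`, `fixed`, `n_underscore`, `n_leading`) are all made up for illustration, and the real alignment files will look different.

```shell
# Build a tiny, made-up aligned FASTA file to experiment on.
tmp=$(mktemp)
printf '%s\n' \
  '>seq 1 Escherichia coli' \
  '..--ACGT-ACGT--' \
  '>seq 2 Bacillus subtilis' \
  '--TTGA-ACGT-GG-' > "$tmp"

# On header lines (starting with ">"), replace every space with an underscore.
# On every other line, replace every period with a hyphen.
fixed=$(sed -e '/^>/ s/ /_/g' -e '/^>/!s/\./-/g' "$tmp")
echo "$fixed"

# grep -c counts matching lines: headers that now contain an underscore...
n_underscore=$(echo "$fixed" | grep -c '^>.*_')

# ...and lines in the original file that start with a period.
n_leading=$(grep -c '^\.' "$tmp")
echo "$n_underscore $n_leading"

rm "$tmp"
```

The `/^>/` address limits the first substitution to header lines, and `/^>/!` applies the second one to everything else, so periods inside sequence names would be left alone. The period is escaped as `\.` because an unescaped `.` in a regular expression matches any character.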