Kernel: Python 2

File handling – exercise

  • Let's write a (new) file

  • Use "file.txt" as the name of the file

  • Opening with mode "w" overwrites any existing file of the same name

  • The file will be located in the current working directory, unless you specify the entire path before the filename

# Open a file
file = open("file.txt", "w")
print "Name of the file: ", file.name

File handling – good practice

  • After opening a file, always close it again once you are done:

  • The close() method of a file object flushes any unwritten information and closes the file object, after which no more writing can be done. It is a good practice to use close().

# Open a file
file = open("file.txt", "w")
# do stuff
file.close()
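Python can also do the closing for you. The sketch below uses the `with` statement, which is not part of the original slides; the scratch directory and the name `handle` are only illustrative. It is written with `print()` calls so that it runs on the Python 2 kernel used here as well as on Python 3.

```python
import os
import tempfile

# Scratch location so this sketch does not overwrite your own "file.txt".
path = os.path.join(tempfile.mkdtemp(), "file.txt")

# The file is closed automatically when the indented block ends,
# even if an error occurs inside it.
with open(path, "w") as handle:
    handle.write("Hello script!\n")
    print("Closed inside the block? %s" % handle.closed)

print("Closed after the block? %s" % handle.closed)
```

With this form there is no `close()` call to forget, which is why it is often preferred for short read or write jobs.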

File handling – good practice

  • Before reading a file, it is also good to check that the file actually exists

import sys  ## module with system-specific parameters and functions

try:
    file = open("file.txt", "r")
    # do stuff
    file.close()
except IOError:  ## raised when the file cannot be opened
    sys.exit("File does not exist!")
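There are two common ways to do this check, sketched below under the assumption that a missing file is the only problem you care about. The path `missing` is a made-up file name in a scratch directory; `os.path.isfile` is a standard-library call that was not used in the original slides.

```python
import os
import tempfile

# A file name that is guaranteed not to exist (fresh, empty scratch directory).
missing = os.path.join(tempfile.mkdtemp(), "does_not_exist.txt")

# Option 1: ask before opening.
print("Exists? %s" % os.path.isfile(missing))

# Option 2: try to open the file and handle the failure.
try:
    handle = open(missing, "r")
    handle.close()
    status = "opened"
except IOError as error:
    status = "could not open: %s" % error.strerror

print(status)
```

Catching `IOError` rather than using a bare `except:` means that unrelated mistakes, such as a typo in a variable name, are still reported as errors instead of being mistaken for a missing file.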

File handling – writing

  • To write to a file you use the command below:

    • file.write("What you want to write")

  • Try writing some variables to a file yourself

file = open("file.txt", "w")
file.write("Hello script!\n")  ## write directly
line = "This is my output!"
file.write(line + "\n")  ## write a string variable
file.close()
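When the variables are numbers rather than strings, they have to be converted first, because `write()` only accepts strings. A minimal sketch with made-up values (`chromosome`, `start` and `end` are hypothetical), again written to a scratch directory:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.txt")

chromosome = "chr1"  # hypothetical example values
start = 100
end = 250

handle = open(path, "w")
# Numbers are converted with str() before they are joined to other strings.
handle.write(chromosome + "\t" + str(start) + "\t" + str(end) + "\n")
# String formatting does the conversion in one step.
handle.write("%s\t%d\n" % (chromosome, end - start))
handle.close()

# Read the file back to see what was written.
handle = open(path, "r")
content = handle.read()
handle.close()
print(content)
```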

File handling – reading

  • Try to read the lines you just wrote to a file in the previous exercise

# Most often a file is read line by line with a for loop.
# Looping over the file object gives one line at a time, just like readline().
# Example:
file = open("file.txt", "r")
for line in file:
    print line,  ## trailing comma: the line already ends with a newline
file.close()

Newline characters

  • The newline character at the end of each line usually just gets in the way; we can remove it using the following command:

line = line.rstrip()  ## returns a copy without trailing whitespace, including the newline
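Two details are easy to miss here: `rstrip()` returns a new string instead of changing `line`, and without an argument it removes all trailing whitespace, not only the newline. A small sketch with a made-up line:

```python
line = "chr1\t100\t250  \n"  # made-up line with trailing spaces and a newline

stripped_all = line.rstrip()          # removes all trailing whitespace
stripped_newline = line.rstrip("\n")  # removes only trailing newlines

# repr() makes the invisible characters visible.
print(repr(line))
print(repr(stripped_all))
print(repr(stripped_newline))
```

Use `rstrip("\n")` when trailing spaces or tabs are meaningful, for example an empty last column in a tab-separated file.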

File handling – splitting the lines

  • To split a line, use line.split(); by default it splits on whitespace, but you can pass any delimiter as an argument (example below)

  • To split on a more complex pattern, use the regex split function (re.split) we discussed this morning

file = open("file.txt", "r")
for line in file:
    splitline = line.split()  ## you can pass a different delimiter!
    print splitline
file.close()
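To make the difference between the delimiters concrete, here is a small sketch on two made-up strings. The regular expression is just one example of a "more difficult pattern".

```python
import re

line = "chr1\t100\t250\tgene A\n"  # made-up tab-separated line

on_whitespace = line.split()                # splits on any run of whitespace
on_tabs = line.rstrip("\n").split("\t")     # splits on tabs only

mixed = "chr1;100, 250  gene"               # made-up line with mixed separators
on_pattern = re.split(r"[;,\s]+", mixed)    # splits on ; , or whitespace

print(on_whitespace)
print(on_tabs)
print(on_pattern)
```

Note that splitting on whitespace breaks "gene A" into two fields, while splitting on tabs keeps it as one.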

File handling – exercise

  • The file "exercise.bed" contains genomic regions and is rather large: the perfect opportunity to use Python to process it

  • Read the BED file and write an output file in which the "chr" in front of the chromosome number is removed (e.g. chr1 becomes 1).

  • Determine the number of regions on each chromosome (each line is one region, e.g. chromosome 1 --> 71906 regions)

  • Parse the file and print to screen the total number of regions and the total size covered for each chromosome

  • Challenge: combine tasks 1 and 2

# Type your code here
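If you get stuck, the sketch below shows one possible approach. It does not use "exercise.bed" itself: it runs on a made-up three-line sample, and it assumes the first three tab-separated columns are chromosome, start and end, with the size of a region taken as end minus start. Overlapping regions would be counted twice in the size total.

```python
import os
import tempfile

# Made-up stand-in for "exercise.bed" (chromosome, start, end; tab-separated).
sample = "chr1\t100\t250\nchr1\t400\t500\nchr2\t0\t75\n"
folder = tempfile.mkdtemp()
bed_path = os.path.join(folder, "sample.bed")
out_path = os.path.join(folder, "sample_nochr.bed")

handle = open(bed_path, "w")
handle.write(sample)
handle.close()

counts = {}  # chromosome -> number of regions
sizes = {}   # chromosome -> total number of bases covered

infile = open(bed_path, "r")
outfile = open(out_path, "w")
for line in infile:
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    if chrom.startswith("chr"):
        chrom = chrom[3:]  # drop the "chr" prefix
    fields[0] = chrom
    outfile.write("\t".join(fields) + "\n")
    counts[chrom] = counts.get(chrom, 0) + 1
    sizes[chrom] = sizes.get(chrom, 0) + int(fields[2]) - int(fields[1])
infile.close()
outfile.close()

for chrom in sorted(counts):
    print("chromosome %s: %d regions, %d bases" % (chrom, counts[chrom], sizes[chrom]))
```

Because the file is processed one line at a time, the same loop works on a large file without loading it into memory.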

As frequently requested, two examples:

  • Restrict the raw_input to nucleotides only:

import re

input_str = ""
while not re.match("^[actg]{1,}$", input_str, re.I):
    input_str = raw_input("Please provide some nucleotides:")
print input_str
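The same check can be wrapped in a small function, which makes it easy to try on a few strings without typing input each time. The function name `is_nucleotides` is made up for this sketch.

```python
import re

def is_nucleotides(text):
    """Return True if text consists only of a, c, g, t (upper or lower case)."""
    return re.match("^[actg]+$", text, re.I) is not None

for candidate in ["ACgt", "acgx", ""]:
    print("%r -> %s" % (candidate, is_nucleotides(candidate)))
```

Here `+` means the same as `{1,}`: at least one character, so an empty string is rejected.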

As frequently requested, two examples:

  • Make it easy to reverse complement a sequence:

from string import maketrans

dna_code = "aCGttgagatcagat"
complement = maketrans("acgtACGT", "tgcaTGCA")
print "dna_code: ", dna_code
print "complement: ", dna_code.translate(complement)
print "reverse complement: ", dna_code.translate(complement)[::-1]
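The import `from string import maketrans` works on the Python 2 kernel used here, but not on current Python 3. As a hedged alternative, the sketch below uses a plain dictionary instead, so it runs on both; the names `COMPLEMENT` and `reverse_complement` are made up for this example, and only the four standard bases are handled.

```python
# Lookup table for the complement of each base, keeping upper/lower case.
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a",
              "A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))

dna_code = "aCGttgagatcagat"
print("dna_code: %s" % dna_code)
print("reverse complement: %s" % reverse_complement(dna_code))
```

Any other character, such as "n", raises a KeyError, which makes unexpected input visible instead of silently passing it through.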