Reading and Writing Files from Python

Packages

Installation

Packages provide additional tools and functions not present in base Python. Python includes a number of packages to start with, and others can be installed by running pip install <package name> or conda install <package name> in your terminal.

Open your terminal by:

  • (PC) Start > Anaconda3(64-bit) > Anaconda Prompt
  • (Mac) Finder > Applications > Anaconda Navigator > Environments Tab > (play button listed next to "root") > Open Terminal

Loading

Once you've installed a package, you can load it into your current Python session with the import statement. Until you do, the package's functions will not be available.

In [1]:
import os #functions for working with your operating system
import shutil #extra functions for working with files

Both of the packages above, os and shutil, are part of the "Python Standard Library" of packages and functions that come with every Python installation.
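If you're not sure whether a package is available on your computer, you can ask Python before importing it. The sketch below is one way to do this with importlib.util.find_spec from the standard library. The helper name is_available and the package name not_a_real_package_xyz are made up for illustration.

```python
import importlib.util

def is_available(package_name):
    """Return True if Python can find a top-level package with this name."""
    return importlib.util.find_spec(package_name) is not None

print(is_available("os"))      # part of the standard library
print(is_available("shutil"))  # part of the standard library
print(is_available("not_a_real_package_xyz"))  # hypothetical name that should not exist
```

If a package you need comes back as unavailable, install it from your terminal and try again.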

We don't load all of these packages every time we start Python for a couple of major reasons:

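  1. If we don't need the functions, loading them just takes up memory and clutters our session.
  2. Many function names are re-used across packages. If we import same-named functions directly into our session (for example, with from pkg1 import fun followed by from pkg2 import fun), whichever was imported last wins! Importing only what we need helps avoid this problem.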

Note: If you need two functions with the same name (for example, fun from both pkg1 and pkg2), you can always import the packages themselves and refer to each function by its "full" name, package-name.function-name:

pkg1.fun()
pkg2.fun()
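For a concrete case, the standard library modules math and cmath both define a function called sqrt. The sketch below shows that the full names keep the two apart, while importing the bare name twice leaves only the most recent one.

```python
import math
import cmath

# Full names keep the two functions separate
print(math.sqrt(4))    # 2.0
print(cmath.sqrt(-4))  # 2j

# Importing the bare name twice: the second import replaces the first
from math import sqrt
from cmath import sqrt

print(sqrt(4))  # (2+0j), because sqrt now refers to cmath.sqrt
```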

Working Directories

To open a file with Python, you'll need to tell Python where the file is located on your computer. You can specify the entire absolute filepath (starting with a drive letter such as C:\ on PC or / on Mac), or you can set a working directory and work with relative file paths.

You can determine where a file is located on your computer by:

  • (PC) Navigate to your desired folder in Windows Explorer and click on it. Click in the address bar at the top of the window to reveal the full path, then copy it.
  • (Mac) Right-click a file in your desired directory > Click Get Info > Highlight and copy the path listed next to "Where:"
  • (Alternate Mac) Right-click a file in your desired directory > Hold down the Option key > Click Copy "file_name" as Pathname.

If a file is located in your working directory, its relative path is just the name of the file!
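To see how Python turns a relative path into an absolute one, you can compare os.getcwd() with os.path.abspath(). This is a small sketch: the file name comes from the example in this section, and the file does not need to exist for the path to be computed.

```python
import os

relative = "Recipes.zip"               # relative path: just the file name
cwd = os.getcwd()                      # the current working directory
absolute = os.path.abspath(relative)   # working directory + relative path

print(cwd)
print(absolute)
print(absolute == os.path.join(cwd, relative))  # the two should match
```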

In [2]:
myfile="/Users/tuesday/Desktop/Python/Recipes.zip" #Mac absolute path
os.path.isfile(myfile) #check if Python can find my file 
Out[2]:
True

Windows Paths

Windows filepaths use \, which Python treats as the start of an escape sequence inside ordinary strings (for example, \n means a new line). This can be fixed in several ways:

  • Replace \ with /.
  • Replace \ with \\.
  • Preface your path with r:

     r"C:\Users\mtjansen\Desktop"
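The short sketch below compares the three spellings. It only builds strings, so it runs on any operating system. The folder name C:\new is a made-up example chosen because it contains \n.

```python
raw = r"C:\Users\mtjansen\Desktop"        # raw string: backslashes are kept as typed
doubled = "C:\\Users\\mtjansen\\Desktop"  # each \\ stands for one backslash
forward = "C:/Users/mtjansen/Desktop"     # forward slashes need no escaping

print(raw == doubled)                     # True
print(raw.replace("\\", "/") == forward)  # True

# Why it matters: "\n" in an ordinary string is a single newline character
print(len("C:\new"))   # 5 characters: C, :, newline, e, w
print(len(r"C:\new"))  # 6 characters: C, :, \, n, e, w
```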

In [3]:
os.chdir("/Users/tuesday/Desktop/Python/") #set working directory
myfile="Recipes.zip" #relative path
os.path.isfile(myfile)
Out[3]:
True

We can get a list of all files in the working directory with os.listdir(".").

In [4]:
print(os.listdir("."))
print(os.listdir("/Users/tuesday/Desktop/Python/")) #alternatively we can specify a folder
['.DS_Store', '.ipynb_checkpoints', 'Files_Packages.ipynb', 'getrecipes.py', 'RBK', 'Recipes', 'Recipes.zip', 'solution_files_packages.py']
['.DS_Store', '.ipynb_checkpoints', 'Files_Packages.ipynb', 'getrecipes.py', 'RBK', 'Recipes', 'Recipes.zip', 'solution_files_packages.py']
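os.listdir() returns files and folders mixed together. If you need to tell them apart, os.path.isdir() and os.path.isfile() can help. The sketch below builds a small temporary folder so that it does not depend on your own files. The names inside it are borrowed from the listing above purely for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Set up one subfolder and one file to list
    os.mkdir(os.path.join(folder, "Recipes"))
    with open(os.path.join(folder, "getrecipes.py"), "w") as f:
        f.write("# placeholder\n")

    entries = sorted(os.listdir(folder))
    # os.listdir() returns names only, so join each one back onto the folder
    folders = [e for e in entries if os.path.isdir(os.path.join(folder, e))]
    files = [e for e in entries if os.path.isfile(os.path.join(folder, e))]

print(entries)  # ['Recipes', 'getrecipes.py']
print(folders)  # ['Recipes']
print(files)    # ['getrecipes.py']
```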

Exercise

  1. Download this zipped file: Recipes.zip.
  2. Unzip the file somewhere on your computer.
  3. Use import os and os.chdir to set your working directory to the unzipped folder "Recipes".
  4. Use os.listdir to check what files are stored in "Recipes".

Extra: Reading Files

Python requires you to both open and close files explicitly. If you forget to close a file, it can remain in use, which can prevent you from opening or changing it later.

Best practice for reading and writing files is to use the with statement, which makes sure files are closed automatically when the indented block ends.
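To see the difference, the sketch below first opens and closes a file by hand, then reads it back using with. It checks the file object's closed attribute inside and after the block. The file example.txt is a throwaway created in a temporary folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")

    # Opening and closing by hand: easy to forget the last line
    txtfile = open(path, "w")
    txtfile.write("hello")
    txtfile.close()

    # Using with: the file is closed for us when the block ends
    with open(path, "r") as txtfile:
        text = txtfile.read()
        print(txtfile.closed)  # False, we are still inside the block

    closed_after = txtfile.closed
    print(closed_after)        # True, the block has ended

print(text)  # hello
```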

In [5]:
os.chdir("/Users/tuesday/Desktop/Python/Recipes")

with open("amaranth-stirfry.txt","r") as txtfile: #"r" indicates that we are reading the textfile and not writing to it
    recipe=txtfile.read() #.read() retrieves raw text information from the file we opened
    
print(recipe)
AMARANTH STIR FRY

1 c amaranth, uncooked
1/8 ts sea salt
1 tb olive oil
1 c leeks, sliced
1/2 c mushrooms, sliced
1/2 c green peppers, sliced
1 t soy sauce
1/2 c whole wheat bread crumbs
1/2 c scallions, sliced
1/2 c pumpkin seeds, toasted

Rinse and drain amaranth. Dry roast amaranth in a hevy skillet over
medium heat for 5 minutes. Bring 3 cups water and salt to a boil.
Stir in amaranth and return to a boil. Lower heat and place a flame
deflector or heat diffuser under the pot. Cover and simmer for 35
minutes or until all water is absorbed, stirring occasionally. Heat
a skillet and brush generously with oil. Add leeks and saute for
5 minutes. Add mushrooms and peppers and saute for 10 minutes,
stirring often. Sprinkle with soy sauce and one teaspoon water.
Sprinkle bread crumbs over top of vegetables. Place amaranth on
top of crumbs. Cover and heat through. Stir to combine all ingredients,
place in a serving dish and garnish with scallions and pumpkin
seeds.



Extra: Writing Files

The recipe above is missing a serving amount. Let's add one in, and then save the file.

In [6]:
recipe = recipe + "Serves 4"

with open("amaranth-stirfry.txt","w") as txtfile: #"w" specifies that we're writing to the file
    txtfile.write(recipe)
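Be careful with "w": it replaces whatever was in the file. If you only want to add text to the end, "a" (append) mode is an alternative to reading the file, editing the text, and writing it all back. The sketch below compares the two modes on a throwaway file in a temporary folder; the file name and contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example-recipe.txt")

    with open(path, "w") as f:   # "w" creates the file (or empties an existing one)
        f.write("EXAMPLE RECIPE\n")

    with open(path, "a") as f:   # "a" adds to the end and keeps what is there
        f.write("Serves 4\n")
    with open(path, "r") as f:
        appended = f.read()

    with open(path, "w") as f:   # "w" again: the earlier contents are gone
        f.write("Serves 4\n")
    with open(path, "r") as f:
        overwritten = f.read()

print(repr(appended))     # 'EXAMPLE RECIPE\nServes 4\n'
print(repr(overwritten))  # 'Serves 4\n'
```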

Extra: Working with Folders

The Amaranth Stir Fry looks like it would make a nice hearty meal for fall. Let's create a new folder called "Fall" in our "Recipes" folder and put a copy of the Amaranth Stir Fry recipe inside it.

In [7]:
os.mkdir("Fall") #os.mkdir() creates a new folder 
shutil.copyfile("amaranth-stirfry.txt", "Fall/amaranth-stirfry.txt") #shutil.copyfile() makes a copy
Out[7]:
'Fall/amaranth-stirfry.txt'

Did it work?

In [8]:
os.path.isfile("Fall/amaranth-stirfry.txt")
Out[8]:
True
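One thing to watch for: os.mkdir() raises a FileExistsError if the folder already exists, which happens if you run the cell a second time. A possible workaround is os.makedirs() with exist_ok=True. The sketch below tries both in a temporary folder; the file name example-recipe.txt is made up.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as recipes:
    source = os.path.join(recipes, "example-recipe.txt")
    with open(source, "w") as f:
        f.write("EXAMPLE RECIPE\n")

    fall = os.path.join(recipes, "Fall")
    os.makedirs(fall, exist_ok=True)  # creates the folder
    os.makedirs(fall, exist_ok=True)  # folder already exists: no error

    try:
        os.mkdir(fall)                # folder already exists: error
        mkdir_failed = False
    except FileExistsError:
        mkdir_failed = True

    # shutil.copyfile() returns the path of the new copy
    copied = shutil.copyfile(source, os.path.join(fall, "example-recipe.txt"))
    copy_exists = os.path.isfile(copied)

print(mkdir_failed)  # True
print(copy_exists)   # True
```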

Great! We only have 199 more recipes to organize by season! Don't worry, though. This is a fast job for Python. We'll start by writing some pseudocode.

To organize our recipes, we'll need to...

  1. Make a list that contains all of our recipe files.
  2. Create folders for each season.
  3. Create a dictionary with lists of ingredients for each season.
  4. Loop through the list of recipe files.
    • Open the file.
    • Loop through each season in the dictionary.
      • Loop through the ingredients in each season.
        • If an ingredient appears in the recipe, copy the recipe to the correct folder.

Using a comprehension, make a list that contains all of our recipe files:

In [9]:
#this list comprehension keeps only our text files, so folders such as "Fall" are not included
flist = [f for f in os.listdir("/Users/tuesday/Desktop/Python/Recipes") if f.endswith(".txt")]
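Two common ways to test for a file extension are slicing off the last three characters and calling the string method .endswith(). They usually agree, but not always. The sketch below runs both on a made-up list of names (not the real contents of the Recipes folder) to show where they differ.

```python
# Made-up names for illustration
names = ["amaranth-stirfry.txt", "Fall", "getrecipes.py", "txt", "NOTES.TXT"]

# Slicing keeps anything whose last three characters are "txt"
by_slice = [f for f in names if f[-3:] == "txt"]

# .endswith(".txt") also requires the dot
by_suffix = [f for f in names if f.endswith(".txt")]

# Lowercasing first also catches upper-case extensions
any_case = [f for f in names if f.lower().endswith(".txt")]

print(by_slice)   # ['amaranth-stirfry.txt', 'txt']
print(by_suffix)  # ['amaranth-stirfry.txt']
print(any_case)   # ['amaranth-stirfry.txt', 'NOTES.TXT']
```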

Create folders for the remaining seasons:

In [10]:
os.mkdir("Spring")
os.mkdir("Winter")
os.mkdir("Summer")

Create lists of ingredients for each season:

In [11]:
spring = ["asparagus", "cabbage", "cauliflower", "chard", "greens", "kale", "peas", "radish", "rhubarb", "strawberries", "turnip", "artichoke"]

summer = ["blackberries", "blueberries", "cantaloupe", "cherries", "cucumber", "eggplant", "beans", "melon", "okra", "peach", "plum", "raspberries", "strawberries", "watermelon", "zucchini", "apricot", "basil"]

fall = ["apple", "brussels sprouts", "cabbage", "cauliflower", "grapes", "mushrooms", "parsnip", "pear", "sweet potato", "pumpkin", "turnip", "rutabaga", "fig", "quince", "pomegranate", "chard", "greens", "kale", "butternut", "acorn", "cranberries"]

winter = ["grapefruit", "orange", "butternut", "acorn", "chestnut", "cranberries", "brussels sprouts", "cabbage", "cauliflower", "sweet potato", "pumpkin", "turnip", "rutabaga", "pomegranate", "chard", "greens", "kale"]

Notice that some ingredients fall into multiple seasons. Some of our recipes will also fall into multiple seasons.

Next, we need to combine all of our ingredient lists into a single dictionary so that we can loop through them later on.

In [12]:
seasons = {"Spring":spring, "Summer":summer, "Fall":fall, "Winter":winter}
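Once the lists are in a dictionary, we can also look things up the other way around: given an ingredient, which seasons is it in? The sketch below uses a shortened version of the dictionary so that it fits on the screen, and the helper name seasons_for is made up for illustration.

```python
# Shortened version of the seasons dictionary (a few ingredients per season)
seasons = {
    "Spring": ["asparagus", "kale", "strawberries"],
    "Summer": ["basil", "strawberries", "zucchini"],
    "Fall": ["apple", "kale", "pumpkin"],
    "Winter": ["kale", "orange", "pumpkin"],
}

def seasons_for(ingredient, seasons):
    """Return the names of every season whose list contains the ingredient."""
    return [name for name, ingredients in seasons.items() if ingredient in ingredients]

print(seasons_for("kale", seasons))          # ['Spring', 'Fall', 'Winter']
print(seasons_for("strawberries", seasons))  # ['Spring', 'Summer']
print(seasons_for("apple", seasons))         # ['Fall']
```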



Exercise

  1. Write a simple loop with two lines of code that will print out all of the ingredients for each season.

Before we tackle the fourth step in our pseudocode, let's take a look at the recipe for Apple Carrot Muffins. We'll practice classifying just this recipe first.

In [13]:
fname = "apple-carrot-muffins.txt" #Store the file name in a variable called "fname"

with open(fname,"r") as txtfile: 
    recipe=txtfile.read()
    
print(recipe)
Apple-Carrot Muffins
Servings 12
 
2 1/2 cups whole wheat flour
1/2 cup soy powder
1 tsp. baking powder
1/4 tsp. salt
1/4 tsp. nutmeg
1/4 tsp[. cinnamon
1/8 cup oil
3/4 cup honey
1 tsp. vanilla
1/2 cup grated apple
1/2 cup grated carrot
 
Preheat oven to 400 degrees F.  In a medium bowl, combine all the
dry ingredients.  Combine all the liquid ingredients in a large
bowl; stir in the apple and carrot.  Add the dry ingredients to
the liquid mixture.  Oil one muffin tin; then spoon the batter into
the cups until each is 2/3 full.  Bake for 20 minutes, or until a
toothpick stuck in the center of the muffin comes out clean.



What season will this recipe fall into? To find out, we need to build some nested loops. Remember our pseudocode?

  • Loop through each season in the dictionary.
    • Loop through the ingredients in each season.
      • If an ingredient appears in the recipe, copy the recipe to the correct folder.

In [14]:
for s in seasons:
    for ingredient in seasons[s]:
        if ingredient in recipe.lower():  #.lower() changes all text in the recipe to lowercase
            shutil.copyfile(fname, os.path.join(s, fname)) #os.path.join() joins the folder name with the file name
    

Now we can use os.listdir() to find out which folder the recipe was placed in.

In [15]:
for s in seasons:      #Loop through each season folder
    print(s)           #Print the folder name
    print(os.listdir(s))  #List the files in the folder.
Spring
[]
Summer
[]
Fall
['amaranth-stirfry.txt', 'apple-carrot-muffins.txt']
Winter
[]
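A caveat about the test ingredient in recipe.lower() used above: with strings, in checks whether the ingredient appears anywhere in the text, even inside a longer word. The sketch below shows the effect on a made-up snippet of recipe text, along with one possible stricter check using the re module. The helper name contains_word is an assumption for illustration, not part of the original exercise.

```python
import re

# Made-up recipe text for illustration
text = "1/2 cup crushed pineapple\n1 cup pearl barley"

# Substring matching finds ingredients hidden inside other words
print("apple" in text.lower())  # True, found inside "pineapple"
print("pear" in text.lower())   # True, found inside "pearl"

def contains_word(ingredient, text):
    """Return True only if the ingredient appears as a whole word."""
    pattern = r"\b" + re.escape(ingredient) + r"\b"  # \b marks a word boundary
    return re.search(pattern, text.lower()) is not None

print(contains_word("apple", text))      # False
print(contains_word("pear", text))       # False
print(contains_word("pineapple", text))  # True
```

The stricter check has a cost of its own: it would no longer match plurals such as "peaches" for "peach", which the simpler substring test catches. Which behavior you want depends on your recipes.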

Exercise

  1. Complete step 4 of the pseudocode for all of the recipes. Remember that you will need to loop through all files in flist and open each file.

If you're having a lot of trouble with this exercise, one possible solution can be found here.