14.6. Plan 5: Get info from all tags of a certain type

To get information from the Cottage Inn locations page, we need to figure out which tags we should get from the soup, and what information we should get from the tags.

A great way to figure this out is to use the “inspect” function in your browser.

By inspecting the locations, we see that each location name is in an h3 tag.

So we need to get info from all the h3 tags on the webpage. The text in those tags has the information we need!

14.6.1. Looking closer at a tag

Behind every webpage is HTML code. HTML code is made up of tags.

Here is the tag that creates the name of one of the Cottage Inn Pizza locations. The tag is surrounded by the blue rectangle. It is an ‘h3’ tag.

[Figure: h3 tag example]

The name of this tag is ‘h3’. In between the start tag and the end tag (between <h3> and </h3>) is the tag’s text. For this tag, the text is Ann Arbor Broadway St.
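To make the name and text of a tag concrete, here is a minimal sketch. It assumes the Beautiful Soup library (bs4) is installed, and the one-line HTML string is a made-up stand-in for the real page, not the actual Cottage Inn markup.

```python
from bs4 import BeautifulSoup

# Hypothetical one-tag page, modeled on the h3 tag described above
html = "<h3>Ann Arbor Broadway St</h3>"
soup = BeautifulSoup(html, "html.parser")

# Get the first h3 tag from the soup
tag = soup.find("h3")

print(tag.name)  # the tag's name: h3
print(tag.text)  # the text between <h3> and </h3>
```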

14.6.2. Plan 5: Example

Here is how to get the text from all the ‘h3’ tags on a webpage:

Goal: Get info from all tags of a certain type
# Get all tags of a certain type from the soup
tags = soup.find_all('h3')
# Collect info from the tags
collect_info = []
for tag in tags:
    # Get info from tag
    info = tag.text
    collect_info.append(info)
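The plan above starts from a soup that already exists. As a sketch of how it might run from start to finish, the example below builds a soup from a small HTML string. The HTML and the second location name are invented for illustration; only the plan itself comes from this section.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the locations page (the real page has more markup)
html = """
<html><body>
  <h3>Ann Arbor Broadway St</h3>
  <p>Some other text</p>
  <h3>Example Location Two</h3>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# Get all tags of a certain type from the soup
tags = soup.find_all('h3')
# Collect info from the tags
collect_info = []
for tag in tags:
    # Get info from tag
    info = tag.text
    collect_info.append(info)

print(collect_info)  # ['Ann Arbor Broadway St', 'Example Location Two']
```

Notice that the p tag is skipped: find_all('h3') only returns tags whose name is h3.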

14.6.3. Plan 5: How to use it

Once you’ve found the tags you want to get information from, do two things:

  1. Find the tag description and put it into the first slot.

How do you do that? Here are some examples:

What you see when you inspect       ->   Tag description in the code

<p>                                 ->   'p'

<h3>                                ->   'h3'

<div class="comment">               ->   'div', class_='comment'

<span style="X5e72;">               ->   'span', style='X5e72;'

<a class="css4z" href="/orders">    ->   'a', class_='css4z'
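Here is a short sketch of how the tag descriptions above fit into find_all. The HTML string and the text inside each tag are made up for illustration; the tag descriptions are the ones from the examples.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment that reuses the tags from the examples above
html = """
<div class="comment">First comment</div>
<div class="footer">Not a comment</div>
<span style="X5e72;">Styled text</span>
<a class="css4z" href="/orders">Orders</a>
"""
soup = BeautifulSoup(html, "html.parser")

# The tag description goes inside the parentheses of find_all
comments = soup.find_all('div', class_='comment')
spans = soup.find_all('span', style='X5e72;')
links = soup.find_all('a', class_='css4z')

print([tag.text for tag in comments])  # only the div with class "comment"
print([tag.text for tag in spans])
print([tag.text for tag in links])
```

The keyword is written class_ (with an underscore) because class is a reserved word in Python.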

  2. Determine whether you want to get text from a tag or a link from a tag.

The info you want    ->   What you put in the code

The tag’s text       ->   text

The tag’s link       ->   get('href') or get('href', None)
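The sketch below shows both choices side by side. The two a tags are invented for illustration; the second one has no href on purpose, to show what get returns when the link is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment: one a tag with a link and one without
html = """
<a class="css4z" href="/orders">Order online</a>
<a class="css4z">No link here</a>
"""
soup = BeautifulSoup(html, "html.parser")

texts = []
links = []
for tag in soup.find_all('a', class_='css4z'):
    # The tag's text
    texts.append(tag.text)
    # The tag's link; None is returned when the tag has no href
    links.append(tag.get('href', None))

print(texts)  # ['Order online', 'No link here']
print(links)  # ['/orders', None]
```

Using get avoids an error when a tag has no href, so the loop can keep going and you can filter out the None values afterward.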

14.6.4. Plan 5: Exercises
