Plan 4: Get info from a single tag

Plan 4: Example

Maybe we want to get just one piece of information from a webpage. In this example, we want to get the first link to a news story from a professor’s page.

Here’s what we see when we use the “inspect” function in the browser.

The tag that has a link to the news article

Since that tag is the first of its type on the page, we can use the plan Get info from a single tag.

Goal: Get info from a single tag
# Get first tag of a certain type from the soup
tag = soup.find('a', class_='item-teaser--more')
# Get info from tag
info = tag.get('href')
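
To try the plan end to end, here is a minimal runnable sketch. The HTML string and its example.edu URLs are made up to stand in for the professor's page; only the class name 'item-teaser--more' comes from the example above. It assumes the beautifulsoup4 package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the professor's page (not the real page)
html = """
<div class="news">
  <a class="item-teaser--more" href="https://example.edu/news/story-1">Read more</a>
  <a class="item-teaser--more" href="https://example.edu/news/story-2">Read more</a>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

# Get first tag of a certain type from the soup
tag = soup.find('a', class_='item-teaser--more')
# Get info from tag
info = tag.get('href')
print(info)
```

Even though two tags match the description, find returns only the first one, so info holds the link to the first story.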

Plan 4: When to use it

Use this when you want to get information that is in the first tag of a certain type on the page.

Plan 4: How to use it

Once you’ve found the tag you want to get information from, do two things:

  1. Find the tag description and put it into the first slot.

How do you do that? Here are some examples:

What you see when you inspect  ->  Tag description in the code

<p>  ->  'p'
<h3>  ->  'h3'
<div class="comment">  ->  'div', class_='comment'
<span style="X5e72;">  ->  'span', style='X5e72;'
<a class="css4z" href="/orders">  ->  'a', class_='css4z'
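
As a rough illustration of how these tag descriptions fit into the first slot, the sketch below runs find on a small made-up HTML snippet. The tag contents are invented for this example; the descriptions are the ones from the table. It also shows that find gives back None when no tag matches the description, which is worth checking before asking for info.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet built only to exercise the tag descriptions above
html = """
<h3>Your orders</h3>
<p>First paragraph</p>
<div class="comment">Nice page!</div>
<a class="css4z" href="/orders">See orders</a>
"""
soup = BeautifulSoup(html, 'html.parser')

heading = soup.find('h3')                      # tag name only
comment = soup.find('div', class_='comment')   # tag name plus class
link = soup.find('a', class_='css4z')          # the href is not part of the description
missing = soup.find('table')                   # no such tag on this page, so this is None

print(heading.text)
print(comment.text)
print(link.get('href'))
print(missing)
```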

  2. Decide whether you want to get the text from the tag or the link from the tag, and put that into the second slot.

The info you want  ->  What you put in the code

The tag’s text  ->  text
The tag’s link  ->  get('href')

One type of tag, the a tag, holds a link.

Here is the tag that creates the link to the North Quad dining hall page. It is an ‘a’ tag.

Link to the North Quad dining hall page

If you want to get the link from a tag, use get('href') in the second slot in this plan.
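
The two choices for the second slot can be compared side by side in a short sketch. The a tag below is hypothetical: its href and link text are placeholders, not the real markup of the dining hall page.

```python
from bs4 import BeautifulSoup

# Hypothetical 'a' tag; the real page's link and text may differ
html = '<p>Visit <a href="/dining/north-quad">North Quad</a> today.</p>'
soup = BeautifulSoup(html, 'html.parser')

# Get first tag of a certain type from the soup
tag = soup.find('a')

# Second slot, option 1: the tag's text (what the reader sees on the page)
link_text = tag.text
# Second slot, option 2: the tag's link (where the click goes)
link_url = tag.get('href')

print(link_text)
print(link_url)
```

Note that text is written without parentheses, while get('href') is a call that names the attribute to read.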

Plan 4: Exercises

Look at the image below, which shows what you see when you inspect the description of the North Quad dining hall.

The tag that creates the description of North Quad

Choose the subgoals that get the text from the tag that has the description of the North Quad dining hall, and put them in the right order. You do not need to use all the blocks.
