Chapter 2: HTML

Section 1: Generating HTML

We have generated HTML using our Web Microworld blocks in Snap. For example, here’s a very tiny web page definition in Snap blocks.

_images/simple-web-page-blocks.png

This generates a page that looks like this:

_images/simple-webpage.png

The Snap stage shows us the HTML that was generated for this page. Remember this – we’ll compare to it later.

_images/simple-web-page-stage.png

The Python Version

Here is a Python program that does the exact same thing as the Snap blocks above. Click “Run” to see the HTML that this generates.

The execution of this program looks pretty similar to what appears on the stage. Both the Snap and Python programs generate the same HTML. They are doing the same tasks.

The Python code might look longer and more complicated, but part of that is because of abstraction. We gave you a set of blocks that hides some of the details. Here is what is inside the “Make a Heading” block. It looks pretty close to the start of the Python program, doesn’t it? The block hides away a lot of the detail and complexity – that’s abstraction.

_images/make-a-heading-level.png

Let’s go through how this works:

  • We are defining two functions using def, the Python keyword for defining functions. One function creates headings. The other creates paragraphs. We use “+” to combine strings, like join in Snap.

  • We are building the HTML for our webpage in the variable webpage.

  • Instead of webpage = webpage +, Python gives us a short form: webpage += that adds something to the end of the webpage variable.

  • At the end, we print the HTML in the variable webpage.
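The steps in this list can be sketched in a few lines of Python. This is only an illustration of the same ideas, not necessarily the exact program from this page: the function names heading and paragraph and the page text are assumptions chosen for the example.

```python
# A minimal sketch of building HTML with two helper functions.
# The names "heading" and "paragraph" and the page text are illustrative assumptions.

def heading(level, text):
    # "+" joins strings together, like the join block in Snap.
    return "<h" + str(level) + ">" + text + "</h" + str(level) + ">"

def paragraph(text):
    return "<p>" + text + "</p>"

# Build up the HTML for the page in the variable webpage.
webpage = "<html><body>"
webpage += heading(1, "My Web Page")      # same as: webpage = webpage + heading(...)
webpage += paragraph("This is a very tiny web page.")
webpage += "</body></html>"

# At the end, print the HTML that was generated.
print(webpage)
```

Each helper function hides the detail of the opening and closing tags, much like the “Make a Heading” block does in Snap.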

Try answering these questions about the Python code above.

Section 2: Scraping HTML

We built a set of Web Microworld blocks in Snap that let us pull content out of Web pages, which is called scraping. That is, we figure out which parts of a page we want and return them.

Here, we grab all the URLs from this website, where the ebook is located.

_images/runestone-scrape.png

The result looks like this:

_images/runestone-scrape-run.png

The Python Version

Here is a Python program that does the exact same thing as the Snap blocks above. Click “Run” to see what it generates.

Let’s go through how this works:

  • We are loading a library called requests, which gives us the ability to read URLs.

  • We are defining a function web_scraper that takes a URL and gets the content at that URL. That content is split into parts. For each part that contains “href”, we strip away the “href=” and print what’s left.

  • The very last line is the one that calls the function web_scraper on this website.
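The steps in this list might look roughly like the sketch below. It is an illustration under some assumptions, not necessarily the program on this page: the helper find_links, the choice to split the content on whitespace, and the sample HTML string are all made up for the example. The only requests call used is requests.get, and it is loaded inside web_scraper so that the rest of the sketch can be tried without a network connection.

```python
# A minimal sketch of scraping links out of HTML text.
# find_links and the sample string are hypothetical, added for illustration.

def find_links(content):
    """Return the values that follow href= in a string of HTML."""
    links = []
    # Split the content into parts wherever there is whitespace (an assumption).
    for part in content.split():
        if "href=" in part:
            # Strip away everything up to and including "href=".
            rest = part[part.index("href=") + len("href="):]
            if rest[:1] in ('"', "'"):
                # Quoted value: keep the text up to the matching closing quote.
                quote = rest[0]
                links.append(rest[1:].split(quote)[0])
            else:
                # Unquoted value: keep the text up to the end of the tag.
                links.append(rest.split(">")[0])
    return links

def web_scraper(url):
    # requests is a third-party library; it is imported here so the rest
    # of this sketch still runs if it is not installed.
    import requests
    content = requests.get(url).text
    for link in find_links(content):
        print(link)

# Try the parsing step on a small, made-up piece of HTML.
sample = '<p><a href="https://example.com/a">A</a> and <a class="x" href="/b">B</a></p>'
for link in find_links(sample):
    print(link)
```

To scrape a live page, you would call web_scraper with the address of that page. Splitting on whitespace is a simple approach that misses some cases, such as a URL that contains spaces.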

Try answering these questions about the Python code.

Section 3: Reading HTML

You have built HTML pages and seen the HTML that gets generated by the blocks. Below are Parsons Problems. For each of these, we give you the lines of HTML, but scrambled. Drag the blocks into the right order, then press the “Check” button to see if you got it right.

Put the blocks into order to define a simple HTML page. The Head comes before the Body, and the Title is inside the Head.

Put the blocks in order to create an HTML page with a body that contains an H2 header, a paragraph, and a link to another page. Only one of the H2 links below is correct – pick the right one, please.

Put the blocks into order to define a simple HTML page. Indent the blocks to show the structure. Note that the “head” comes before the “body” and the title is defined in the head. The body should contain a paragraph with a link in it.
