Wednesday, February 12, 2014

Python triple-quoted strings vs. raw strings

Python lets you include text as strings in a number of ways, but picking the right one is important. There are two specific types of strings that I get confused and occasionally need a little reference to sort out: triple-quoted strings and raw strings.

A triple-quoted string starts and ends with three quotation marks and looks like this:

TRIPLE = """first\nsecond
third"""


whereas a raw string has a preceding letter r and looks like this:

RAWSTRING = r"I want some \nicely\ formatted text"

The triple-quoted string preserves everything in it, including literal line breaks, and it still interprets backslash sequences such as "\n" as escape characters, just like a regular string. It will include anything but another triple quote, which it interprets as the end. If you printed TRIPLE it would look like this:

first
second
third
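
To see what is actually stored, it can help to look at the repr() of the string rather than its printed form. This is only a small sketch reusing the TRIPLE string from above; the escaped "\n" and the line break typed into the source should both show up as the same newline character:

```python
TRIPLE = """first\nsecond
third"""

# repr() shows the stored characters: the escaped \n and the literal
# line break both end up as the same newline character.
print(repr(TRIPLE))          # 'first\nsecond\nthird'
print(TRIPLE.splitlines())   # ['first', 'second', 'third']
print(TRIPLE.count("\n"))    # 2
```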


Raw strings work a little differently. The r prefix tells Python not to interpret escape sequences, so the "\n" is not converted to a newline the way it would be in an ordinary or triple-quoted string. (RAWSTRING also can't be broken across lines, but that is because it uses one pair of quotation marks rather than triple quotes, not because it is raw.) So if you printed RAWSTRING you would see this:

I want some \nicely\ formatted text
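
A quick way to convince yourself of this is to compare a raw string with its ordinary equivalent. The following is a minimal sketch using the RAWSTRING from above; in the raw version the backslash and the "n" stay two separate characters:

```python
RAWSTRING = r"I want some \nicely\ formatted text"

# In a raw string the backslash and the "n" stay two separate characters.
print(r"\n" == "\\n")      # True
print(len(r"\n"))          # 2
print(len("\n"))           # 1
print("\n" in RAWSTRING)   # False: there is no real newline in the string
```

One caveat: a raw string can't end with a single backslash, so r"abc\" is a syntax error.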

Triple-quoting is very convenient for copying data from somewhere and pasting it into a string when you want to preserve the newlines. Raw strings are useful for code-related things that may contain backslashes for other reasons, such as regular expressions.
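
As a small illustration of why raw strings suit regular expressions, consider the word-boundary pattern \b. In an ordinary string Python turns "\b" into a backspace character before the re module ever sees it. The sample text in TEXT is made up for this sketch:

```python
import re

TEXT = "cat concat cat."  # made-up sample text

# Without the r prefix, "\b" becomes a backspace character, so the
# pattern looks for backspace + "cat" + backspace and finds nothing.
print(re.findall("\bcat\b", TEXT))    # []

# With the r prefix, re receives \b and treats it as a word boundary.
print(re.findall(r"\bcat\b", TEXT))   # ['cat', 'cat']
```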

I can use a triple-quoted string to pull the list of users directly from an email and stick it into a string in a Python script, and then I can use a raw string to create a regular expression to pull the usernames out:


#!/usr/bin/env python3

import re

DATA = """Holly Martins (HMARTINS)
Anna Schmidt (ASCHMIDT)
Harry Lime (HLIME)"""



REGEX = r"\s\((\w+)\)"

names = re.findall(REGEX, DATA)
names.sort()
for name in names:
    print(name)

Result:

ASCHMIDT
HLIME
HMARTINS
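
The two forms can also be combined: r"""...""" is a raw string that is allowed to span lines. Here is a hedged sketch of the same extraction with the pattern written that way and compiled with re.VERBOSE, so that the whitespace and the # comments inside the pattern are ignored. It is just one possible way to lay out the regular expression from the script above:

```python
import re

DATA = """Holly Martins (HMARTINS)
Anna Schmidt (ASCHMIDT)
Harry Lime (HLIME)"""

# r"""...""" is both raw and multi-line, so the pattern can be spread
# out and commented. re.VERBOSE makes re skip the whitespace and the
# comments inside the pattern.
REGEX = re.compile(r"""
    \s          # whitespace before the parenthesis
    \(          # literal opening parenthesis
    (\w+)       # the username, captured
    \)          # literal closing parenthesis
""", re.VERBOSE)

names = sorted(REGEX.findall(DATA))
print(names)    # ['ASCHMIDT', 'HLIME', 'HMARTINS']
```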

Comments:

  1. One more thing: a triple quoted string can also be raw, as in

    r"""..."""

    This should treat the backslashes within the triple-quoted string as ordinary characters again, and is in fact useful for forming VERBOSE regular expressions, with comments embedded right inside the regular expression. (See the documentation for the re module.)
