Python 3 data types

Updated: 11/06/2021 by Computer Hope
Python command

This page describes the standard data type modules available in Python 3, and how to use them.

Overview

The modules described on this page provide a variety of specialized data types such as dates and times, fixed-type arrays, heap queues, synchronized queues, and sets.

Python also provides some built-in data types, in particular, dict, list, set and frozenset, and tuple. The str class is used to hold Unicode strings, and the bytes class is used to hold binary data.

Datetime: Basic date and time types

The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient attribute extraction for output formatting and manipulation. For related functionality, see also the time and calendar modules.

There are two kinds of date and time objects: “naive” and “aware”.

An aware object has sufficient knowledge of applicable algorithmic and political time adjustments, such as time zone and daylight saving time information, to locate itself relative to other aware objects. An aware object is used to represent a specific moment in time that is not open to interpretation.

A naive object does not contain enough information to unambiguously locate itself relative to other date/time objects. Whether a naive object represents UTC (Coordinated Universal Time), local time, or time in some other timezone is purely up to the program, like it is up to the program whether a particular number represents metres, miles, or mass. Naive objects are easy to understand and to work with, at the cost of ignoring some aspects of reality.

For applications requiring aware objects, datetime and time objects have an optional time zone information attribute, tzinfo, that can be set to an instance of a subclass of the abstract tzinfo class. These tzinfo objects capture information about the offset from UTC time, the time zone name, and whether Daylight Saving Time is in effect. Note that only one concrete tzinfo class, the timezone class, is supplied by the datetime module. The timezone class can represent simple timezones with fixed offset from UTC, such as UTC itself or North American EST and EDT timezones. Supporting timezones at deeper levels of detail is up to the application. The rules for time adjustment across the world are more political than rational, change frequently, and there is no standard suitable for every application aside from UTC.

The datetime module exports the following constants:

datetime.MINYEAR
The smallest year number allowed in a date or datetime object. MINYEAR is 1.
datetime.MAXYEAR
The largest year number allowed in a date or datetime object. MAXYEAR is 9999.

datetime Available Types

class datetime.date
An idealized naive date, assuming the current Gregorian calendar always was, and always will be, in effect. Attributes: year, month, and day.
class datetime.time
An idealized time, independent of any particular day, assuming that every day has exactly 24*60*60 seconds (there is no notion of “leap seconds” here). Attributes: hour, minute, second, microsecond, and tzinfo.
class datetime.datetime
A combination of a date and a time. Attributes: year, month, day, hour, minute, second, microsecond, and tzinfo.
class datetime.timedelta
A duration expressing the difference between two date, time, or datetime instances to microsecond resolution.
class datetime.tzinfo
An abstract base class for time zone information objects. These are used by the datetime and time classes to provide a customizable notion of time adjustment (for example, to account for time zone and/or daylight saving time).
class datetime.timezone
A class that implements the tzinfo abstract base class as a fixed offset from the UTC.

Objects of these types are immutable.

Objects of the date type are always naive.

An object of type time or datetime may be naive or aware. A datetime object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None, or if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns None, d is naive. A time object t is aware if t.tzinfo is not None and t.tzinfo.utcoffset(None) does not return None. Otherwise, t is naive.

The distinction between naive and aware doesn’t apply to timedelta objects.

Subclass relationships:

object
    timedelta
    tzinfo
        timezone
    time
    date
        datetime

datetime.timedelta Objects

A timedelta object represents a duration, the difference between two dates or times.

class datetime.timedelta(days=0,  seconds=0,  microseconds=0,  milliseconds=0,  minutes=0,  hours=0,  weeks=0)
All arguments are optional and default to 0. Arguments may be integers or floats, and may be positive or negative.

Only days, seconds and microseconds are stored internally. Arguments are converted to those units:

  • A millisecond is converted to 1000 microseconds.
  • A minute is converted to 60 seconds.
  • An hour is converted to 3600 seconds.
  • A week is converted to seven days.
and days, seconds and microseconds are then normalized so that the representation is unique, with

  • 0 <= microseconds < 1000000
  • 0 <= seconds < 3600*24 (the number of seconds in one day)
  • -999999999 <= days <= 999999999
If any argument is a float and there are fractional microseconds, the fractional microseconds left over from all arguments are combined and their sum is rounded to the nearest microsecond using round-half-to-even tiebreaker. If no argument is a float, the conversion and normalization processes are exact (no information is lost).

If the normalized value of days lies outside the indicated range, OverflowError is raised.

Note that normalization of negative values may be surprising at first. For example,

>>> from datetime import timedelta>>> d = timedelta(microseconds=-1)>>> (d.days, d.seconds, d.microseconds)(-1, 86399, 999999)

Class attributes are:

timedelta.min The most negative timedelta object, timedelta(-999999999).
timedelta.max The most positive timedelta object, timedelta(days=999999999, hours=23, minutes=59, seconds=59, microseconds=999999).
timedelta.resolution The smallest possible difference between non-equal timedelta objects, timedelta(microseconds=1).

Note that, because of normalization, timedelta.max > -timedelta.min. -timedelta.max is not representable as a timedelta object.

Instance attributes (read-only):

Attribute Value
days Between -999999999 and 999999999 inclusive
seconds Between 0 and 86399 inclusive
microseconds Between 0 and 999999 inclusive

Supported operations:

Operation Result Notes
t1 = t2 + t3 Sum of t2 and t3. Afterwards t1-t2 == t3 and t1-t3 == t2 are true. 1.
t1 = t2 - t3 Difference of t2 and t3. Afterwards t1 == t2 - t3 and t2 == t1 + t3 are true. 1.
t1 = t2 * i or t1 = i * t2 Delta multiplied by an integer. Afterwards t1 // i == t2 is true, provided i != 0. In general, t1 * i == t1 * (i-1) + t1 is true. 1.
t1 = t2 * f or t1 = f * t2 Delta multiplied by a float. The result is rounded to the nearest multiple of timedelta.resolution using round-half-to-even.
f = t2 / t3 Division of t2 by t3. Returns a float object.
t1 = t2 / f or t1 = t2 / i Delta divided by a float or an int. The result is rounded to the nearest multiple of timedelta.resolution using round-half-to-even.
t1 = t2 // i or t1 = t2 // t3 The floor is computed and the remainder (if any) is thrown away. In the second case, an integer is returned. 3.
t1 = t2 % t3 The remainder is computed as a timedelta object. 3.
q, r = divmod(t1, t2) Computes the quotient and the remainder: q = t1 // t2 and r = t1 % t2. q is an integer and r is a timedelta object.
+t1 Returns a timedelta object with the same value. 2.
-t1 equivalent to timedelta(-t1.days, -t1.seconds, -t1.microseconds), and to t1* -1. 1., 4.
abs(t) equivalent to +t when t.days >= 0, and to -t when t.days < 0. 2.
str(t) Returns a string in the form [D day[s], ][H]H:MM:SS[.UUUUUU], where D is negative for negative t. 5.
repr(t) Returns a string in the form datetime.timedelta(D[, S[, U]]), where D is negative for negative t. 5.

Notes:

  • This is exact, but may overflow.
  • This is exact, and cannot overflow.
  • Division by 0 raises ZeroDivisionError.
  • -timedelta.max is not representable as a timedelta object.
  • String representations of timedelta objects are normalized similarly to their internal representation. This leads to somewhat unusual results for negative timedeltas. For example:
>>> timedelta(hours=-5)
datetime.timedelta(-1, 68400)
>>> print(_)
-1 day, 19:00:00

In addition to the operations listed above timedelta objects support certain additions and subtractions with date and datetime objects (see below).

Changed in version 3.2: Floor division and true division of a timedelta object by another timedelta object are now supported, as are remainder operations and the divmod() function. True division and multiplication of a timedelta object by a float object are now supported.

Comparisons of timedelta objects are supported with the timedelta object representing the smaller duration considered to be the smaller timedelta. To stop mixed-type comparisons from falling back to the default comparison by object address, when a timedelta object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.

timedelta objects are hashable (usable as dictionary keys), support efficient pickling, and in Boolean contexts, a timedelta object is considered to be true if and only if it isn’t equal to timedelta(0).

Instance methods:

timedelta.total_seconds()
Return the total number of seconds contained in the duration. Equivalent to td / timedelta(seconds=1).

Note that for very large time intervals (greater than 270 years on most platforms) this method will lose microsecond accuracy.
>>> from datetime import timedelta
>>> year = timedelta(days=365)
>>> another_year = timedelta(weeks=40, days=84, hours=23,
...                          minutes=50, seconds=600)  # adds up to 365 days
>>> year.total_seconds()
31536000.0
>>> year == another_year
True
>>> ten_years = 10 * year
>>> ten_years, ten_years.days // 365
(datetime.timedelta(3650), 10)
>>> nine_years = ten_years - year
>>> nine_years, nine_years.days // 365
(datetime.timedelta(3285), 9)
>>> three_years = nine_years // 3;
>>> three_years, three_years.days // 365
(datetime.timedelta(1095), 3)
>>> abs(three_years - ten_years) == 2 * three_years + year
True

datetime.date Objects

A date object represents a date (year, month and day) in an idealized calendar, the current Gregorian calendar indefinitely extended in both directions. January 1 of year 1 is called day number 1, January 2 of year 1 is called day number 2, and so on. This matches the definition of the “proleptic Gregorian” calendar in Dershowitz and Reingold's book "Calendrical Calculations", where it's the base calendar for all computations. See the book for algorithms for converting between proleptic Gregorian ordinals and other calendar systems.

class datetime.date(year, month, day)
All arguments are required. Arguments may be integers, in the following ranges:

  • MINYEAR <= year <= MAXYEAR
  • 1 <= month <= 12
  • 1 <= day <= number of days in the given month and year
If an argument outside those ranges is given, ValueError is raised.

Other constructors, all class methods:

classmethod date.today()
Return the local date corresponding to the POSIX timestamp, such as is returned by time.time(). This may raise OverflowError, if the timestamp is out of the range of values supported by the platform C localtime() function, and OSError on localtime() failure. It's common for this to be restricted to years from 1970 through 2038. Note that on non-POSIX systems that include leap seconds in their notion of a timestamp, leap seconds are ignored by fromtimestamp().
classmethod date.fromordinal(ordinal)
Return the date corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1. ValueError is raised unless 1 <= ordinal <= date.max.toordinal(). For any date d, date.fromordinal(d.toordinal()) == d.

Class attributes:

date.min The earliest representable date, date(MINYEAR, 1, 1).
date.max The latest representable date, date(MAXYEAR, 12, 31).
date.resolution The smallest possible difference between non-equal date objects, timedelta(days=1).

Instance attributes (read-only):

date.year Between MINYEAR and MAXYEAR inclusive.
date.month Between 1 and 12 inclusive.
date.day Between 1 and the number of days in the given month of the given year.

Supported operations:

Operation Result Notes
date2 = date1 + timedelta date2 is timedelta.days days removed from date1. 1.
date2 = date1 - timedelta Computes date2 such that date2 + timedelta == date1. 2.
timedelta = date1 - date2 See note 3. 3.
date1 < date2 date1 is considered less than date2 when date1 precedes date2 in time.

Notes:

  • date2 is moved forward in time if timedelta.days > 0, or backward if timedelta.days < 0. Afterward date2 - date1 == timedelta.days. timedelta.seconds and timedelta.microseconds are ignored. OverflowError is raised if date2.year would be smaller than MINYEAR or larger than MAXYEAR.
  • This isn’t quite equivalent to date1 + (-timedelta), because -timedelta in isolation can overflow in cases where date1 - timedelta does not. timedelta.seconds and timedelta.microseconds are ignored.
  • This is exact, and cannot overflow. timedelta.seconds and timedelta.microseconds are 0, and date2 + timedelta == date1 after.
  • In other words, date1 < date2 if and only if date1.toordinal() < date2.toordinal(). To stop comparison from falling back to the default scheme of comparing object addresses, date comparison normally raises TypeError if the other comparand isn’t also a date object. However, NotImplemented is returned instead if the other comparand has a timetuple() attribute. This hook gives other kinds of date objects a chance at implementing mixed-type comparison. If not, when a date object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.

Dates can be used as dictionary keys. In Boolean contexts, all date objects are considered to be true.

Instance methods:

date.replace(year, month, day)
Return a date with the same value, except for those parameters given new values by whichever keyword arguments are specified. For example, if d == date(2002, 12, 31), then d.replace(day=26) == date(2002, 12, 26).
date.timetuple()
Return a time.struct_time such as returned by time.localtime(). The hours, minutes and seconds are 0, and the DST flag is -1. d.timetuple() is equivalent to time.struct_time((d.year, d.month, d.day, 0, 0, 0, d.weekday(), yday, -1)), where yday = d.toordinal() - date(d.year, 1, 1).toordinal() + 1 is the day number in the current year starting with 1 for January 1st.
date.toordinal()
Return the proleptic Gregorian ordinal of the date, where January 1 of year 1 has ordinal 1. For any date object d, date.fromordinal(d.toordinal()) == d.
date.weekday()
Return the day of the week as an integer, where Monday is 0 and Sunday is 6. For example, date(2002, 12, 4).weekday() == 2, a Wednesday. See also isoweekday().
date.isoweekday()
Return the day of the week as an integer, where Monday is 1 and Sunday is 7. For example, date(2002, 12, 4).isoweekday() == 3, a Wednesday. See also weekday(), isocalendar().
date.isocalendar()
Return a 3-tuple, (ISO year, ISO week number, ISO weekday).

The ISO calendar is a widely used variant of the Gregorian calendar. See https://webspace.science.uu.nl/~gent0113/calendar/isocalendar.htm for a good explanation.

The ISO year consists of 52 or 53 full weeks, and where a week starts on a Monday and ends on a Sunday. The first week of an ISO year is the first (Gregorian) calendar week of a year containing a Thursday. This is called week number 1, and the ISO year of that Thursday is the same as its Gregorian year.

For example, 2004 begins on a Thursday, so the first week of ISO year 2004 begins on Monday, 29 Dec 2003 and ends on Sunday, 4 Jan 2004, so that date(2003, 12, 29).isocalendar() == (2004, 1, 1) and date(2004, 1, 4).isocalendar() == (2004, 1, 7).
date.isoformat()
Return a string representing the date in ISO 8601 format, ‘YYYY-MM-DD’. For example, date(2002, 12, 4).isoformat() == '2002-12-04'.
date.__str__()
For a date d, str(d) is equivalent to d.isoformat().
date.ctime()
Return a string representing the date, for example date(2002, 12, 4).ctime() == 'Wed Dec 4 00:00:00 2002'. d.ctime() is equivalent to time.ctime(time.mktime(d.timetuple())) on platforms where the native C ctime() function (which time.ctime() invokes, but which date.ctime() does not invoke) conforms to the C standard.
date.strftime(format)
Return a string representing the date, controlled by an explicit format string. Format codes referring to hours, minutes or seconds see 0 values. For a complete list of formatting directives, see strftime() and strptime() behavior.
date.__format__(format)
Same as date.strftime(). This makes it possible to specify format string for a date object when using str.format(). For a complete list of formatting directives, see strftime() and strptime() behavior.

Example of counting days to an event:

>>> import time
>>> from datetime import date
>>> today = date.today()
>>> today
datetime.date(2007, 12, 5)
>>> today == date.fromtimestamp(time.time())
True
>>> my_birthday = date(today.year, 6, 24)
>>> if my_birthday < today:
...     my_birthday = my_birthday.replace(year=today.year + 1)
>>> my_birthday
datetime.date(2008, 6, 24)
>>> time_to_birthday = abs(my_birthday - today)
>>> time_to_birthday.days
202

Example of working with date:

>>> from datetime import date
>>> d = date.fromordinal(730920) # 730920th day after 1. 1. 0001
>>> d
datetime.date(2002, 3, 11)
>>> t = d.timetuple()
>>> for i in t:     
...     print(i)
2002                # year
3                   # month
11                  # day
0
0
0
0                   # weekday (0 = Monday)
70                  # 70th day in the year
-1
>>> ic = d.isocalendar()
>>> for i in ic:    
...     print(i)
2002                # ISO year
11                  # ISO week number
1                   # ISO day number ( 1 = Monday )
>>> d.isoformat()
'2002-03-11'
>>> d.strftime("%d/%m/%y")
'11/03/02'
>>> d.strftime("%A %d. %B %Y")
'Monday 11. March 2002'
>>> 'The {1} is {0:%d}, the {2} is {0:%B}.'.format(d, "day", "month")
'The day is 11, the month is March.'

datetime.datetime Objects

A datetime object is a single object containing all the information from a date object and a time object. Like a date object, datetime assumes the current Gregorian calendar extended in both directions; like a time object, datetime assumes there are exactly 3600*24 seconds in every day.

Constructor:

class datetime.datetime(year,  month,  day,  hour=0,  minute=0,  second=0,  microsecond=0,  tzinfo=None)
The year, month and day arguments are required. The tzinfo may be None, or an instance of a tzinfo subclass. The remaining arguments may be integers, in the following ranges:

  • MINYEAR <= year <= MAXYEAR
  • 1 <= month <= 12
  • 1 <= day <= number of days in the given month and year
  • 0 <= hour < 24
  • 0 <= minute < 60
  • 0 <= second < 60
  • 0 <= microsecond < 1000000
If an argument outside those ranges is given, ValueError is raised.

Other constructors, all class methods:

datetime.today()
Return the current local datetime, with tzinfo None. This is equivalent to datetime.fromtimestamp(time.time()). See also now(), fromtimestamp().
datetime.now(tz=None)
Return the current local date and time. If optional argument tz is None or not specified, this is like today(), but, if possible, supplies more precision than can be gotten from going through a time.time() timestamp (for example, this may be possible on platforms supplying the C gettimeofday() function).

Else tz must be an instance of a class tzinfo subclass, and the current date and time are converted to tz‘s time zone. In this case, the result is equivalent to tz.fromutc(datetime.utcnow().replace(tzinfo=tz)). See also today(), utcnow().
datetime.utcnow()
Return the current UTC date and time, with tzinfo None. This is like now(), but returns the current UTC date and time, as a naive datetime object. An aware current UTC datetime can be obtained by calling datetime.now(timezone.utc). See also now().
datetime.fromtimestamp(timestamp,  tz=None)
Return the local date and time corresponding to the POSIX timestamp, such as is returned by time.time(). If optional argument tz is None or not specified, the timestamp is converted to the platform's local date and time, and the returned datetime object is naive.

Else tz must be an instance of a class tzinfo subclass, and the timestamp is converted to tz‘s time zone. In this case, the result is equivalent to tz.fromutc(datetime.utcfromtimestamp( timestamp).replace(tzinfo=tz)).

fromtimestamp() may raise OverflowError, if the timestamp is out of the range of values supported by the platform C localtime() or gmtime() functions, and OSError on localtime() or gmtime() failure. It's common for this to be restricted to years in 1970 through 2038. Note that on non-POSIX systems that include leap seconds in their notion of a timestamp, leap seconds are ignored by fromtimestamp(), and then it's possible to have two timestamps differing by a second that yield identical datetime objects. See also utcfromtimestamp().

Changed in version 3.3: Raise OverflowError instead of ValueError if the timestamp is out of the range of values supported by the platform C localtime() or gmtime() functions. Raise OSError instead of ValueError on localtime() or gmtime() failure.
datetime.utcfromtimestamp(timestamp)
Return the UTC datetime corresponding to the POSIX timestamp, with tzinfo None. This may raise OverflowError, if the timestamp is out of the range of values supported by the platform C gmtime() function, and OSError on gmtime() failure. It's common for this to be restricted to years in 1970 through 2038. See also fromtimestamp().

On the POSIX compliant platforms, utcfromtimestamp(timestamp) is equivalent to the following expression:

datetime(1970, 1, 1) + timedelta(seconds=timestamp)
datetime.fromordinal(ordinal)
Return the datetime corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1. ValueError is raised unless 1 <= ordinal <= datetime.max.toordinal(). The hour, minute, second and microsecond of the result are all 0, and tzinfo is None.
datetime.combine(date,  time)
Return a new datetime object whose date components are equal to the given date object's, and whose time components and tzinfo attributes are equal to the given time object's. For any datetime object d, d == datetime.combine(d.date(), d.timetz()). If date is a datetime object, its time components and tzinfo attributes are ignored.
datetime.strptime(date_string,  format)
Return a datetime corresponding to date_string, parsed according to format. This is equivalent to datetime(*(time.strptime(date_string, format)[0:6])). ValueError is raised if the date_string and format can’t be parsed by time.strptime() or if it returns a value which isn’t a time tuple. For a complete list of formatting directives, see strftime() and strptime() behavior.

Class attributes:

datetime.min The earliest representable datetime, datetime(MINYEAR, 1, 1, tzinfo=None).
datetime.max The latest representable datetime, datetime(MAXYEAR, 12, 31, 23, 59, 59, 999999, tzinfo=None).
datetime.resolution The smallest possible difference between non-equal datetime objects, timedelta(microseconds=1).

Instance attributes (read-only):

datetime.year Between MINYEAR and MAXYEAR inclusive.
datetime.month Between 1 and 12 inclusive.
datetime.day Between 1 and the number of days in the given month of the given year.
datetime.hour In range(24).
datetime.minute In range(60).
datetime.second In range(60).
datetime.microsecond In range(1000000).
datetime.tzinfo The object passed as the tzinfo argument to the datetime constructor, or None if none was passed.

Supported operations:

Operation Result
datetime2 = datetime1 + timedelta 1.
datetime2 = datetime1 - timedelta 2.
timedelta = datetime1 - datetime2 3.
datetime1 < datetime2 Compares datetime to datetime; see 4.
  • datetime2 is a duration of timedelta removed from datetime1, moving forward in time if timedelta.days > 0, or backward if timedelta.days < 0. The result has the same tzinfo attribute as the input datetime, and datetime2 - datetime1 == timedelta after. OverflowError is raised if datetime2.year would be smaller than MINYEAR or larger than MAXYEAR. Note that no time zone adjustments are done even if the input is an aware object.
  • Computes the datetime2 such that datetime2 + timedelta == datetime1. As for addition, the result has the same tzinfo attribute as the input datetime, and no time zone adjustments are done even if the input is aware. This isn’t quite equivalent to datetime1 + (-timedelta), because -timedelta in isolation can overflow in cases where datetime1 - timedelta does not.
  • Subtraction of a datetime from a datetime is defined only if both operands are naive, or if both are aware. If one is aware and the other is naive, TypeError is raised.

    If both are naive, or both are aware and have the same tzinfo attribute, the tzinfo attributes are ignored, and the result is a timedelta object t such that datetime2 + t == datetime1. No time zone adjustments are done in this case.

    If both are aware and have different tzinfo attributes, a-b acts as if a and b were first converted to naive UTC datetimes first. The result is (a.replace(tzinfo=None) - a.utcoffset()) - (b.replace(tzinfo=None) - b.utcoffset()) except that the implementation never overflows.
  • datetime1 is considered less than datetime2 when datetime1 precedes datetime2 in time.

    If one comparand is naive and the other is aware, TypeError is raised if an order comparison is attempted. For equality comparisons, naive instances are never equal to aware instances.

    If both comparands are aware, and have the same tzinfo attribute, the common tzinfo attribute is ignored and the base datetimes are compared. If both comparands are aware and have different tzinfo attributes, the comparands are first adjusted by subtracting their UTC offsets (obtained from self.utcoffset()).

    Note: To stop comparison from falling back to the default scheme of comparing object addresses, datetime comparison normally raises TypeError if the other comparand isn’t also a datetime object. However, NotImplemented is returned instead if the other comparand has a timetuple() attribute. This hook gives other kinds of date objects a chance at implementing mixed-type comparison. If not, when a datetime object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.

datetime objects can be used as dictionary keys. In Boolean contexts, all datetime objects are considered to be true.

Instance methods:

datetime.date()
Return date object with same year, month and day.
datetime.time()
Return time object with same hour, minute, second and microsecond. The tzinfo is None. See also method timetz().
datetime.timetz()
Return time object with same hour, minute, second, microsecond, and tzinfo attributes. See also method time().
datetime.replace([year [, month [, day [, hour [, minute [, second [, microsecond [, tzinfo]]]]]]]])
Return a datetime with the same attributes, except for those attributes given new values by whichever keyword arguments are specified. Note that tzinfo=None can be specified to create a naive datetime from an aware datetime with no conversion of date and time data.
datetime.astimezone(tz=None)
Return a datetime object with new tzinfo attribute tz, adjusting the date and time data so the result is the same UTC time as self, but in tz‘s local time.

If provided, tz must be an instance of a tzinfo subclass, and its utcoffset() and dst() methods must not return None. self must be aware (self.tzinfo must not be None, and self.utcoffset() must not return None).

If called without arguments (or with tz=None) the system local timezone is assumed. The tzinfo attribute of the converted datetime instance will be set to an instance of timezone with the zone name and offset obtained from the OS.

If self.tzinfo is tz, self.astimezone(tz) equals self: no adjustment of date or time data is performed. Else the result is local time in time zone tz, representing the same UTC time as self: after astz = dt.astimezone(tz), astz - astz.utcoffset() will usually have the same date and time data as dt - dt.utcoffset(). The discussion of class tzinfo explains the cases at Daylight Saving Time transition boundaries where this cannot be achieved (an issue only if tz models both standard and daylight time).

If you merely want to attach a time zone object tz to a datetime dt without adjustment of date and time data, use dt.replace(tzinfo=tz). If you merely want to remove the time zone object from an aware datetime dt without conversion of date and time data, use dt.replace(tzinfo=None).

Note that the default tzinfo.fromutc() method can be overridden in a tzinfo subclass to affect the result returned by astimezone(). Ignoring error cases, astimezone() acts like:

def astimezone(self, tz): if self.tzinfo is tz: return self # Convert self to UTC, and attach the new time zone object. utc = (self - self.utcoffset()).replace(tzinfo=tz) # Convert from UTC to tz's local time. return tz.fromutc(utc)
datetime.utcoffset()
If tzinfo is None, returns None, else returns self.tzinfo.utcoffset(self), and raises an exception if the latter doesn’t return None, or a timedelta object representing a whole number of minutes with magnitude less than one day.
datetime.dst()
If tzinfo is None, returns None, else returns self.tzinfo.dst(self), and raises an exception if the latter doesn’t return None, or a timedelta object representing a whole number of minutes with magnitude less than one day.
datetime.tzname()
If tzinfo is None, returns None, else returns self.tzinfo.tzname(self), raises an exception if the latter doesn’t return None or a string object.
datetime.timetuple()
Return a time.struct_time such as returned by time.localtime(). d.timetuple() is equivalent to time.struct_time((d.year, d.month, d.day, d.hour, d.minute, d.second, d.weekday(), yday, dst)), where yday = d.toordinal() - date(d.year, 1, 1).toordinal() + 1 is the day number in the current year starting with 1 for January 1st. The tm_isdst flag of the result is set according to the dst() method: tzinfo is None or dst() returns None, tm_isdst is set to -1; else if dst() returns a non-zero value, tm_isdst is set to 1; else tm_isdst is set to 0.
datetime.utctimetuple()
If datetime instance d is naive, this is the same as d.timetuple() except that tm_isdst is forced to 0 regardless of what d.dst() returns. DST is never in effect for a UTC time.

If d is aware, d is normalized to UTC time, by subtracting d.utcoffset(), and a time.struct_time for the normalized time is returned. tm_isdst is forced to 0. Note that an OverflowError may be raised if d.year was MINYEAR or MAXYEAR and UTC adjustment spills over a year boundary.
datetime.toordinal()
Return the proleptic Gregorian ordinal of the date. The same as self.date().toordinal().
datetime.timestamp()
Return POSIX timestamp corresponding to the datetime instance. The return value is a float similar to that returned by time.time().

Naive datetime instances are assumed to represent local time and this method relies on the platform C mktime() function to perform the conversion. Since datetime supports wider range of values than mktime() on many platforms, this method may raise OverflowError for times far in the past or far in the future.

For aware datetime instances, the return value is computed as:

(dt - datetime(1970, 1, 1, tzinfo=timezone.utc)).total_seconds()
datetime.weekday()
Return the day of the week as an integer, where Monday is 0 and Sunday is 6. The same as self.date().weekday(). See also isoweekday().
datetime.isoweekday()
Return the day of the week as an integer, where Monday is 1 and Sunday is 7. The same as self.date().isoweekday(). See also weekday(), isocalendar().
datetime.isocalendar()
Return a 3-tuple, (ISO year, ISO week number, ISO weekday). The same as self.date().isocalendar().
datetime.isoformat(sep='T')
Return a string representing the date and time in ISO 8601 format, YYYY-MM-DDTHH:MM:SS.mmmmmm or, if microsecond is 0, YYYY-MM-DDTHH:MM:SS

If utcoffset() does not return None, a 6-character string is appended, giving the UTC offset in (signed) hours and minutes: YYYY-MM-DDTHH:MM:SS.mmmmmm+HH:MM or, if microsecond is 0, YYYY-MM-DDTHH:MM:SS+HH:MM

The optional argument sep (default 'T') is a one-character separator, placed between the date and time portions of the result. For example,

>>> from datetime import tzinfo, timedelta, datetime>>> class TZ(tzinfo):...     def utcoffset(self, dt): return timedelta(minutes=-399)...>>> datetime(2002, 12, 25, tzinfo=TZ()).isoformat(' ')'2002-12-25 00:00:00-06:39'
datetime.__str__()
For a datetime instance d, str(d) is equivalent to d.isoformat(' ').
datetime.ctime()
Return a string representing the date and time, for example datetime(2002, 12, 4, 20, 30, 40).ctime() == 'Wed Dec 4 20:30:40 2002'. d.ctime() is equivalent to time.ctime(time.mktime(d.timetuple())) on platforms where the native C ctime() function (which time.ctime() invokes, but which datetime.ctime() does not invoke) conforms to the C standard.
datetime.strftime(format)
Return a string representing the date and time, controlled by an explicit format string. For a complete list of formatting directives, see strftime() and strptime() behavior.
datetime.__format__(format)
Same as datetime.strftime(). This makes it possible to specify format string for a datetime object when using str.format(). For a complete list of formatting directives, see strftime() and strptime() behavior.

Examples of working with datetime objects:

>>> from datetime import datetime, date, time
>>> # Using datetime.combine()
>>> d = date(2005, 7, 14)
>>> t = time(12, 30)
>>> datetime.combine(d, t)
datetime.datetime(2005, 7, 14, 12, 30)
>>> # Using datetime.now() or datetime.utcnow()
>>> datetime.now()   
datetime.datetime(2007, 12, 6, 16, 29, 43, 79043)   # GMT +1
>>> datetime.utcnow()   
datetime.datetime(2007, 12, 6, 15, 29, 43, 79060)
>>> # Using datetime.strptime()
>>> dt = datetime.strptime("21/11/06 16:30", "%d/%m/%y %H:%M")
>>> dt
datetime.datetime(2006, 11, 21, 16, 30)
>>> # Using datetime.timetuple() to get tuple of all attributes
>>> tt = dt.timetuple()
>>> for it in tt:   
...     print(it)
...
2006    # year
11      # month
21      # day
16      # hour
30      # minute
0       # second
1       # weekday (0 = Monday)
325     # number of days since 1st January
-1      # dst - method tzinfo.dst() returned None
>>> # Date in ISO format
>>> ic = dt.isocalendar()
>>> for it in ic:   
...     print(it)
...
2006    # ISO year
47      # ISO week
2       # ISO weekday
>>> # Formatting datetime
>>> dt.strftime("%A, %d. %B %Y %I:%M%p")
'Tuesday, 21. November 2006 04:30PM'
>>> 'The {1} is {0:%d}, the {2} is {0:%B}, the {3} is {0:%I:%M%p}.'.format(dt, "day", "month", "time")
'The day is 21, the month is November, the time is 04:30PM.'

Using datetime with tzinfo:

>>> from datetime import timedelta, datetime, tzinfo
>>> class GMT1(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=1) + self.dst(dt)
...     def dst(self, dt):
...         # DST starts last Sunday in March
...         d = datetime(dt.year, 4, 1)   # ends last Sunday in October
...         self.dston = d - timedelta(days=d.weekday() + 1)
...         d = datetime(dt.year, 11, 1)
...         self.dstoff = d - timedelta(days=d.weekday() + 1)
...         if self.dston <=  dt.replace(tzinfo=None) < self.dstoff:
...             return timedelta(hours=1)
...         else:
...             return timedelta(0)
...     def tzname(self,dt):
...          return "GMT +1"
...
>>> class GMT2(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=2) + self.dst(dt)
...     def dst(self, dt):
...         d = datetime(dt.year, 4, 1)
...         self.dston = d - timedelta(days=d.weekday() + 1)
...         d = datetime(dt.year, 11, 1)
...         self.dstoff = d - timedelta(days=d.weekday() + 1)
...         if self.dston <=  dt.replace(tzinfo=None) < self.dstoff:
...             return timedelta(hours=1)
...         else:
...             return timedelta(0)
...     def tzname(self,dt):
...         return "GMT +2"
...
>>> gmt1 = GMT1()
>>> # Daylight Saving Time
>>> dt1 = datetime(2006, 11, 21, 16, 30, tzinfo=gmt1)
>>> dt1.dst()
datetime.timedelta(0)
>>> dt1.utcoffset()
datetime.timedelta(0, 3600)
>>> dt2 = datetime(2006, 6, 14, 13, 0, tzinfo=gmt1)
>>> dt2.dst()
datetime.timedelta(0, 3600)
>>> dt2.utcoffset()
datetime.timedelta(0, 7200)
>>> # Convert datetime to another time zone
>>> dt3 = dt2.astimezone(GMT2())
>>> dt3     
datetime.datetime(2006, 6, 14, 14, 0, tzinfo=<GMT2 object at 0x...>)
>>> dt2     
datetime.datetime(2006, 6, 14, 13, 0, tzinfo=<GMT1 object at 0x...>)
>>> dt2.utctimetuple() == dt3.utctimetuple()
True

datetime.time Objects

A time object represents a (local) time of day, independent of any particular day, and subject to adjustment via a tzinfo object.

class datetime.time(hour=0,  minute=0,  second=0,  microsecond=0,  tzinfo=None)
All arguments are optional. The tzinfo may be None, or an instance of a tzinfo subclass. The remaining arguments may be integers, in the following ranges:

  • 0 <= hour < 24
  • 0 <= minute < 60
  • 0 <= second < 60
  • 0 <= microsecond < 1000000.
If an argument outside those ranges is given, ValueError is raised. All default to 0 except tzinfo, which defaults to None.

Class attributes:

time.min The earliest representable time, time(0, 0, 0, 0).
time.max The latest representable time, time(23, 59, 59, 999999).
time.resolution The smallest possible difference between non-equal time objects, timedelta(microseconds=1), although note that arithmetic on time objects is not supported.

Instance attributes (read-only):

time.hour In range(24).
time.minute In range(60).
time.second In range(60).
time.microsecond In range(1000000).
time.tzinfo The object passed as the tzinfo argument to the time constructor, or None if none was passed.

Supported operations:

  • comparison of time to time, where a is considered less than b when a precedes b in time. If one comparand is naive and the other is aware, TypeError is raised if an order comparison is attempted. For equality comparisons, naive instances are never equal to aware instances.

    If both comparands are aware, and have the same tzinfo attribute, the common tzinfo attribute is ignored and the base times are compared. If both comparands are aware and have different tzinfo attributes, the comparands are first adjusted by subtracting their UTC offsets (obtained from self.utcoffset()). To stop mixed-type comparisons from falling back to the default comparison by object address, when a time object is compared to an object of a different type, TypeError is raised unless the comparison is == or !=. The latter cases return False or True, respectively.
  • hash, use as dict key
  • efficient "pickling" (object serialization)
  • in Boolean contexts, a time object is considered to be true if and only if, after converting it to minutes and subtracting utcoffset() (or 0 if that's None), the result is non-zero.

Instance methods:

time.replace([hour[,  minute [, second [, microsecond [, tzinfo]]]]])
Return a time with the same value, except for those attributes given new values by whichever keyword arguments are specified. Note that tzinfo=None can be specified to create a naive time from an aware time, without conversion of the time data.
time.isoformat()
Return a string representing the time in ISO 8601 format, HH:MM:SS.mmmmmm or, if self.microsecond is 0, HH:MM:SS. If utcoffset() does not return None, a 6-character string is appended, giving the UTC offset in (signed) hours and minutes: HH:MM:SS.mmmmmm+HH:MM or, if self.microsecond is 0, HH:MM:SS+HH:MM
time.__str__()
For a time t, str(t) is equivalent to t.isoformat().
time.strftime(format)
Return a string representing the time, controlled by an explicit format string. For a complete list of formatting directives, see strftime() and strptime() behavior.
time.__format__(format)
Same as time.strftime(). This makes it possible to specify format string for a time object when using str.format(). For a complete list of formatting directives, see strftime() and strptime() behavior.
time.utcoffset()
If tzinfo is None, returns None, else returns self.tzinfo.utcoffset(None), and raises an exception if the latter doesn’t return None or a timedelta object representing a whole number of minutes with magnitude less than one day.
time.dst()
If tzinfo is None, returns None, else returns self.tzinfo.dst(None), and raises an exception if the latter doesn’t return None, or a timedelta object representing a whole number of minutes with magnitude less than one day.
time.tzname()
If tzinfo is None, returns None, else returns self.tzinfo.tzname(None), or raises an exception if the latter doesn’t return None or a string object.

Example:

>>> from datetime import time, tzinfo
>>> class GMT1(tzinfo):
...     def utcoffset(self, dt):
...         return timedelta(hours=1)
...     def dst(self, dt):
...         return timedelta(0)
...     def tzname(self,dt):
...         return "Europe/Prague"
...
>>> t = time(12, 10, 30, tzinfo=GMT1())
>>> t                               
datetime.time(12, 10, 30, tzinfo=<GMT1 object at 0x...>)
>>> gmt = GMT1()
>>> t.isoformat()
'12:10:30+01:00'
>>> t.dst()
datetime.timedelta(0)
>>> t.tzname()
'Europe/Prague'
>>> t.strftime("%H:%M:%S %Z")
'12:10:30 Europe/Prague'
>>> 'The {} is {:%H:%M}.'.format("time", t)
'The time is 12:10.'

datetime.tzinfo Objects

tzinfo is an abstract base class, meaning that this class should not be instantiated directly. You need to derive a concrete subclass, and (at least) supply implementations of the standard tzinfo methods needed by the datetime methods you use. The datetime module supplies a simple concrete subclass of tzinfo timezone that can represent timezones with fixed offset from UTC such as UTC itself or North American EST and EDT.

An instance of (a concrete subclass of) tzinfo can be passed to the constructors for datetime and time objects. The latter objects view their attributes as being in local time, and the tzinfo object supports methods revealing offset of local time from UTC, the name of the time zone, and DST offset, all relative to a date or time object passed to them.

Special requirement for pickling: A tzinfo subclass must have an __init__() method that can be called with no arguments, else it can be pickled but possibly not unpickled again. This is a technical requirement that can relax in the future.

A concrete subclass of tzinfo may need to implement the following methods. Exactly which methods are needed depends on the uses made of aware datetime objects. If in doubt, implement all of them.

tzinfo.utcoffset(dt)

Return offset of local time from UTC, in minutes east of UTC. If local time is west of UTC, this should be negative. Note that this is intended to be the total offset from UTC; for example, if a tzinfo object represents both time zone and DST adjustments, utcoffset() should return their sum. If the UTC offset isn’t known, return None. Else the value returned must be a timedelta object specifying a whole number of minutes in the range -1439 to 1439 inclusive (1440 = 24*60; the magnitude of the offset must be less than one day). Most implementations of utcoffset() will probably look like one of these two:

return CONSTANT                 # fixed-offset classreturn CONSTANT + self.dst(dt)  # daylight-aware class

If utcoffset() does not return None, dst() should not return None either. The default implementation of utcoffset() raises NotImplementedError.

tzinfo.dst(dt)

Return the daylight saving time (DST) adjustment, in minutes east of UTC, or None if DST information isn’t known. Return timedelta(0) if DST is not in effect. If DST is in effect, return the offset as a timedelta object (see utcoffset() for details). Note that DST offset, if applicable, has already been added to the UTC offset returned by utcoffset(), so there's no need to consult dst() unless you’re interested in obtaining DST info separately. For example, datetime.timetuple() calls its tzinfo attribute's dst() method to determine how the tm_isdst flag should be set, and tzinfo.fromutc() calls dst() to account for DST changes when crossing time zones.

An instance tz of a tzinfo subclass that models both standard and daylight times must be consistent in this sense:

tz.utcoffset(dt) - tz.dst(dt)

must return the same result for every datetime dt with dt.tzinfo == tz For sane tzinfo subclasses, this expression yields the time zone's “standard offset”, which should not depend on the date or the time, but only on geographic location. The implementation of datetime.astimezone() relies on this, but cannot detect violations; it's the programmer's responsibility to ensure it. If a tzinfo subclass cannot guarantee this, it may be able to override the default implementation of tzinfo.fromutc() to work correctly with astimezone() regardless.

Most implementations of dst() will probably look like one of these two:

def dst(self, dt):
    # a fixed-offset class:  doesn't account for DST
    return timedelta(0)

or

def dst(self, dt):
    # Code to set dston and dstoff to the time zone's DST
    # transition times based on the input dt.year, and expressed
    # in standard local time.  Then
    if dston <= dt.replace(tzinfo=None) < dstoff:
        return timedelta(hours=1)
    else:
        return timedelta(0)

The default implementation of dst() raises NotImplementedError.

tzinfo.tzname(dt)

Return the time zone name corresponding to the datetime object dt, as a string. Nothing about string names is defined by the datetime module, and there's no requirement that it mean anything in particular. For example, “GMT”, “UTC”, “-500”, “-5:00”, “EDT”, “US/Eastern”, “America/New York” are all valid replies. Return None if a string name isn’t known. Note that this is a method rather than a fixed string primarily because some tzinfo subclasses will want to return different names depending on the specific value of dt passed, especially if the tzinfo class is accounting for daylight time.

The default implementation of tzname() raises NotImplementedError.

These methods are called by a datetime or time object, in response to their methods of the same names. A datetime object passes itself as the argument, and a time object passes None as the argument. A tzinfo subclass's methods should therefore be prepared to accept a dt argument of None, or of class datetime.

When None is passed, it's up to the class designer to decide the best response. For example, returning None is appropriate if the class wants to say that time objects don’t participate in the tzinfo protocols. It may be more useful for utcoffset(None) to return the standard UTC offset, as there is no other convention for discovering the standard offset.

When a datetime object is passed in response to a datetime method, dt.tzinfo is the same object as self. tzinfo methods can rely on this, unless user code calls tzinfo methods directly. The intent is that the tzinfo methods interpret dt as being in local time, and not need worry about objects in other timezones.

There is one more tzinfo method that a subclass may want to override:

tzinfo.fromutc(dt)
This is called from the default datetime.astimezone() implementation. When called from that, dt.tzinfo is self, and dt‘s date and time data are to be viewed as expressing a UTC time. The purpose of fromutc() is to adjust the date and time data, returning an equivalent datetime in self‘s local time.

Most tzinfo subclasses should be able to inherit the default fromutc() implementation without problems. It's strong enough to handle fixed-offset time zones, and time zones accounting for both standard and daylight time, and the latter even if the DST transition times differ in different years. An example of a time zone the defawill nult fromutc() implementation may not handle correctly in all cases is one where the standard offset (from UTC) depends on the specific date and time passed, that can happen for political reasons. The default implementations of astimezone() and fromutc() may not produce the result you want if the result is one of the hours straddling the moment the standard offset changes.

Skipping code for error cases, the default fromutc() implementation acts like:

def fromutc(self, dt): # raise ValueError error if dt.tzinfo is not self dtoff = dt.utcoffset() dtdst = dt.dst() # raise ValueError if dtoff is None or dtdst is None delta = dtoff - dtdst  # this is self's standard offset if delta: dt += delta   # convert to standard local time dtdst = dt.dst() # raise ValueError if dtdst is None if dtdst: return dt + dtdst else: return dt

Example tzinfo classes:

from datetime import tzinfo, timedelta, datetime
ZERO = timedelta(0)
HOUR = timedelta(hours=1)
# A UTC class.
class UTC(tzinfo):
    """UTC"""
    def utcoffset(self, dt):
        return ZERO
    def tzname(self, dt):
        return "UTC"
    def dst(self, dt):
        return ZERO
utc = UTC()
# A class building tzinfo objects for fixed-offset time zones.
# Note that FixedOffset(0, "UTC") is a different way to build a
# UTC tzinfo object.
class FixedOffset(tzinfo):
    """Fixed offset in minutes east from UTC."""
    def __init__(self, offset, name):
        self.__offset = timedelta(minutes=offset)
        self.__name = name
    def utcoffset(self, dt):
        return self.__offset
    def tzname(self, dt):
        return self.__name
    def dst(self, dt):
        return ZERO
# A class capturing the platform's idea of local time.
import time as _time
STDOFFSET = timedelta(seconds = -_time.timezone)
if _time.daylight:
    DSTOFFSET = timedelta(seconds = -_time.altzone)
else:
    DSTOFFSET = STDOFFSET
DSTDIFF = DSTOFFSET - STDOFFSET
class LocalTimezone(tzinfo):
    def utcoffset(self, dt):
        if self._isdst(dt):
            return DSTOFFSET
        else:
            return STDOFFSET
    def dst(self, dt):
        if self._isdst(dt):
            return DSTDIFF
        else:
            return ZERO
    def tzname(self, dt):
        return _time.tzname[self._isdst(dt)]
    def _isdst(self, dt):
        tt = (dt.year, dt.month, dt.day,
              dt.hour, dt.minute, dt.second,
              dt.weekday(), 0, 0)
        stamp = _time.mktime(tt)
        tt = _time.localtime(stamp)
        return tt.tm_isdst > 0
Local = LocalTimezone()
# A complete implementation of current DST rules for major US time zones.
def first_sunday_on_or_after(dt):
    days_to_go = 6 - dt.weekday()
    if days_to_go:
        dt += timedelta(days_to_go)
    return dt
# US DST Rules
#
# This is a simplified (i.e., wrong for a few cases) set of rules for US
# DST start and end times. For a complete and up-to-date set of DST rules
# and timezone definitions, visit the Olson Database (or try pytz):
# http://www.twinsun.com/tz/tz-link.htm
# http://sourceforge.net/projects/pytz/ (might not be up-to-date)
#
# In the US since 2007, DST starts at 2am (standard time) on the second
# Sunday in March, which is the first Sunday on or after Mar 8.
DSTSTART_2007 = datetime(1, 3, 8, 2)
# and ends at 2am (DST time; 1am standard time) on the first Sunday of Nov.
DSTEND_2007 = datetime(1, 11, 1, 1)
# From 1987 to 2006, DST used to start at 2am (standard time) on the first
# Sunday in April and to end at 2am (DST time; 1am standard time) on the last
# Sunday of October, which is the first Sunday on or after Oct 25.
DSTSTART_1987_2006 = datetime(1, 4, 1, 2)
DSTEND_1987_2006 = datetime(1, 10, 25, 1)
# From 1967 to 1986, DST used to start at 2am (standard time) on the last
# Sunday in April (the one on or after April 24) and to end at 2am (DST time;
# 1am standard time) on the last Sunday of October, which is the first Sunday
# on or after Oct 25.
DSTSTART_1967_1986 = datetime(1, 4, 24, 2)
DSTEND_1967_1986 = DSTEND_1987_2006
class USTimeZone(tzinfo):
    def __init__(self, hours, reprname, stdname, dstname):
        self.stdoffset = timedelta(hours=hours)
        self.reprname = reprname
        self.stdname = stdname
        self.dstname = dstname
    def __repr__(self):
        return self.reprname
    def tzname(self, dt):
        if self.dst(dt):
            return self.dstname
        else:
            return self.stdname
    def utcoffset(self, dt):
        return self.stdoffset + self.dst(dt)
    def dst(self, dt):
        if dt is None or dt.tzinfo is None:
            # An exception may be sensible here, in one or both cases.
            # It depends on how you want to treat them.  The default
            # fromutc() implementation (called by the default astimezone()
            # implementation) passes a datetime with dt.tzinfo is self.
            return ZERO
        assert dt.tzinfo is self
        # Find start and end times for US DST. For years before 1967, return
        # ZERO for no DST.
        if 2006 < dt.year:
            dststart, dstend = DSTSTART_2007, DSTEND_2007
        elif 1986 < dt.year < 2007:
            dststart, dstend = DSTSTART_1987_2006, DSTEND_1987_2006
        elif 1966 < dt.year < 1987:
            dststart, dstend = DSTSTART_1967_1986, DSTEND_1967_1986
        else:
            return ZERO
        start = first_sunday_on_or_after(dststart.replace(year=dt.year))
        end = first_sunday_on_or_after(dstend.replace(year=dt.year))
        # Can't compare naive to aware objects, so strip the timezone from
        # dt first.
        if start <= dt.replace(tzinfo=None) < end:
            return HOUR
        else:
            return ZERO
Eastern  = USTimeZone(-5, "Eastern", "EST", "EDT")
Central  = USTimeZone(-6, "Central", "CST", "CDT")
Mountain = USTimeZone(-7, "Mountain", "MST", "MDT")
Pacific  = USTimeZone(-8, "Pacific", "PST", "PDT")

Note that there are unavoidable subtleties twice per year in a tzinfo subclass accounting for both standard and daylight time, at the DST transition points. For concreteness, consider US Eastern (UTC -0500), where EDT begins the minute after 1:59 (EST) on the second Sunday in March, and ends the minute after 1:59 (EDT) on the first Sunday in November:

  UTC   3:MM  4:MM  5:MM  6:MM  7:MM  8:MM
  EST  22:MM 23:MM  0:MM  1:MM  2:MM  3:MM
  EDT  23:MM  0:MM  1:MM  2:MM  3:MM  4:MM
start  22:MM 23:MM  0:MM  1:MM  3:MM  4:MM
  end  23:MM  0:MM  1:MM  1:MM  2:MM  3:MM

When DST starts (the “start” line), the local wall clock leaps from 1:59 to 3:00. A wall time of the form 2:MM doesn’t really make sense on that day, so astimezone(Eastern) won’t deliver a result with hour == 2 on the day DST begins. For astimezone() to make this guarantee, the tzinfo.dst() method must consider times in the “missing hour” (2:MM for Eastern) to be in daylight time.

When DST ends (the “end” line), there's a potentially worse problem: there's an hour that can’t be spelled unambiguously in local wall time: the last hour of daylight time. In Eastern, that's times of the form 5:MM UTC on the day daylight time ends. The local wall clock leaps from 1:59 (daylight time) back to 1:00 (standard time) again. Local times of the form 1:MM are ambiguous. astimezone() mimics the local clock's behavior by mapping two adjacent UTC hours into the same local hour then. In the Eastern example, UTC times of the form 5:MM and 6:MM both map to 1:MM when converted to Eastern. For astimezone() to make this guarantee, the tzinfo.dst() method must consider times in the “repeated hour” to be in standard time. This is easily arranged, as in the example, by expressing DST switch times in the time zone's standard local time.

Applications that can’t bear such ambiguities should avoid using hybrid tzinfo subclasses; there are no ambiguities when using timezone, or any other fixed-offset tzinfo subclass (such as a class representing only EST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)).

datetime.timezone Objects

The timezone class is a subclass of tzinfo, each instance of which represents a timezone defined by a fixed offset from UTC. Note that objects of this class cannot be used to represent timezone information in the locations where different offsets are used in different days of the year or where historical changes have been made to civil time.

class datetime.timezone(offset [, name])
The offset argument must be specified as a timedelta object representing the difference between the local time and UTC. It must be strictly between -timedelta(hours=24) and timedelta(hours=24) and represent a whole number of minutes, otherwise ValueError is raised.

The name argument is optional. If specified it must be a string that is used as the value returned by the tzname(dt) method. Otherwise, tzname(dt) returns a string ‘UTCsHH:MM’, where s is the sign of offset, HH and MM are two digits of offset.hours and offset.minutes respectively.
timezone.utcoffset(dt)
Return the fixed value specified when the timezone instance is constructed. The dt argument is ignored. The return value is a timedelta instance equal to the difference between the local time and UTC.
timezone.tzname(dt)
Return the fixed value specified when the timezone instance is constructed or a string ‘UTCsHH:MM’, where s is the sign of offset, HH and MM are two digits of offset.hours and offset.minutes respectively.
timezone.dst(dt)
Always returns None.
timezone.fromutc(dt)
Return dt + offset. The dt argument must be an aware datetime instance, with tzinfo set to self.

Class attributes:

timezone.utc
The UTC timezone, timezone(timedelta(0)).

strftime() and strptime() Behavior

date, datetime, and time objects all support a strftime(format) method, to create a string representing the time under the control of an explicit format string. Broadly speaking, d.strftime(fmt) acts like the time module's time.strftime(fmt, d.timetuple()) although not all objects support a timetuple() method.

Conversely, the datetime.strptime() class method creates a datetime object from a string representing a date and time and a corresponding format string. datetime.strptime(date_string, format) is equivalent to datetime(*(time.strptime(date_string, format)[0:6])).

For time objects, the format codes for year, month, and day should not be used, as time objects have no such values. If they are used, 1900 is substituted for the year, and 1 for the month and day.

For date objects, the format codes for hours, minutes, seconds, and microseconds should not be used, as date objects have no such values. If they are used, 0 is substituted for them.

The full set of format codes supported varies across platforms, because Python calls the platform C library's strftime() function, and platform variations are common. To see the full set of format codes supported on your platform, consult the strftime documentation.

The following is a list of all the format codes that the C standard (1989 version) requires, and these work on all platforms with a standard C implementation. Note that the 1999 version of the C standard added additional format codes.

Directive Meaning Example Notes
%a Weekday as locale's abbreviated name. Sun, Mon, ..., Sat (en_US);
So, Mo, ..., Sa (de_DE)
1.
%A Weekday as locale's full name. Sunday, Monday, ..., Saturday (en_US);
Sonntag, Montag, ..., Samstag (de_DE)
1.
%w Weekday as a decimal number, where 0 is Sunday and 6 is Saturday. 0, 1, ..., 6
%d Day of the month as a zero-padded decimal number. 01, 02, ..., 31
%b Month as locale's abbreviated name. Jan, Feb, ..., Dec (en_US);
Jan, Feb, ..., Dez (de_DE)
1.
%B Month as locale's full name. January, February, ..., December (en_US);
Januar, Februar, ..., Dezember (de_DE)
1.
%m Month as a zero-padded decimal number. 01, 02, ..., 12
%y Year without century as a zero-padded decimal number. 00, 01, ..., 99
%Y Year with century as a decimal number. 0001, 0002, ..., 2013, 2014, ..., 9998, 9999 2.
%H Hour (24-hour clock) as a zero-padded decimal number. 00, 01, ..., 23
%I Hour (12-hour clock) as a zero-padded decimal number. 01, 02, ..., 12
%p Locale's equivalent of either AM or PM. AM, PM (en_US);
am, pm (de_DE)
1., 3.
%M Minute as a zero-padded decimal number. 00, 01, ..., 59
%S Second as a zero-padded decimal number. 00, 01, ..., 59 4.
%f Microsecond as a decimal number, zero-padded on the left. 000000, 000001, ..., 999999 5.
%z UTC offset in the form +HHMM or -HHMM (empty string if the object is naive). (empty), +0000, -0400, +1030 6.
%Z Time zone name (empty string if the object is naive). (empty), UTC, EST, CST
%j Day of the year as a zero-padded decimal number. 001, 002, ..., 366
%U Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0. 00, 01, ..., 53 7.
%W Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0. 00, 01, ..., 53 7.
%c Locale's appropriate date and time representation. Tue Aug 16 21:30:00 1988 (en_US);
Di 16 Aug 21:30:00 1988 (de_DE)
1.
%x Locale's appropriate date representation. 08/16/88 (None);
08/16/1988 (en_US);
16.08.1988 (de_DE)
1.
%X Locale's appropriate time representation. 21:30:00 (en_US);
21:30:00 (de_DE)
1.
%% A literal '%' character. %

Notes:

  • Because the format depends on the current locale, care should be taken when making assumptions about the output value. Field orderings vary (for example, “month/day/year” versus “day/month/year”), and the output may contain Unicode characters encoded using the locale's default encoding (for example, if the current locale is ja_JP, the default encoding could be any one of eucJP, SJIS, or utf-8; use locale.getlocale() to determine the current locale's encoding).
  • The strptime() method can parse years in the full [1, 9999] range, but years < 1000 must be zero-filled to 4-digit width.
  • When used with the strptime() method, the %p directive only affects the output hour field if the %I directive is used to parse the hour.
  • Unlike the time module, the datetime module does not support leap seconds.
  • When used with the strptime() method, the %f directive accepts from one to six digits and zero pads on the right. %f is an extension to the set of format characters in the C standard (but implemented separately in datetime objects, and therefore always available).
  • For a naive object, the %z and %Z format codes are replaced by empty strings. For an aware object: %z: utcoffset() is transformed into a 5-character string of the form +HHMM or -HHMM, where HH is a 2-digit string giving the number of UTC offset hours, and MM is a 2-digit string giving the number of UTC offset minutes. For example, if utcoffset() returns timedelta(hours=-3, minutes=-30), %z is replaced with the string '-0330'.

    %Z: If tzname() returns None, %Z is replaced by an empty string. Otherwise, %Z is replaced by the returned value, which must be a string.
  • When used with the strptime() method, %U and %W are only used in calculations when the day of the week and the year are specified.

Calendar: working with calendar dates

This module allows you to output calendars like the Unix cal program, and provides additional useful functions related to the calendar. By default, these calendars have Monday as the first day of the week, and Sunday as the last (the European convention). Use setfirstweekday() to set the first day of the week to Sunday (6) or to any other weekday. Parameters that specify dates are given as integers. For related functionality, see also the datetime and time modules.

Most of these functions and classes rely on the datetime module which uses an idealized calendar, the current Gregorian calendar extended in both directions. This matches the definition of the “proleptic Gregorian” calendar.

class calendar.Calendar(firstweekday=0)
Creates a Calendar object. firstweekday is an integer specifying the first day of the week. 0 is Monday (the default), 6 is Sunday.

A Calendar object provides several methods that can be used for preparing the calendar data for formatting. This class doesn’t do any formatting itself. This is the job of subclasses.

Calendar instances have the following methods:

iterweekdays()
Return an iterator for the week day numbers that will be used for one week. The first value from the iterator will be the same as the value of the firstweekday property.
itermonthdates(year,  month)
Return an iterator for the month month (1-12) in the year year. This iterator returns all days (as datetime.date objects) for the month and all days before the start of the month or after the end of the month that are required to get a complete week.
itermonthdays2(year,  month)
Return an iterator for the month month in the year year similar to itermonthdates(). Days returned will be tuples consisting of a day number and a week day number.
itermonthdays(year,  month)
Return an iterator for the month month in the year year similar to itermonthdates(). Days returned will be day numbers.
monthdatescalendar(year,  month)
Return a list of the weeks in the month month of the year as full weeks. Weeks are lists of seven datetime.date objects.
monthdays2calendar(year,  month)
Return a list of the weeks in the month month of the year as full weeks. Weeks are lists of seven tuples of day numbers and weekday numbers.
monthdayscalendar(year,  month)
Return a list of the weeks in the month month of the year as full weeks. Weeks are lists of seven day numbers.
yeardatescalendar(year,  width=3)
Return the data for the specified year ready for formatting. The return value is a list of month rows. Each month row contains up to width months (defaulting to 3). Each month contains between 4 and 6 weeks and each week contains 1–7 days. Days are datetime.date objects.
yeardays2calendar(year,  width=3)
Return the data for the specified year ready for formatting (similar to yeardatescalendar()). Entries in the week lists are tuples of day numbers and weekday numbers. Day numbers outside this month are zero.
yeardayscalendar(year,  width=3)
Return the data for the specified year ready for formatting (similar to yeardatescalendar()). Entries in the week lists are day numbers. Day numbers outside this month are zero.

The calendar.TextCalendar Class

class calendar.TextCalendar(firstweekday=0)
This class can generate plain text calendars.

TextCalendar instances have the following methods:

formatmonth(theyear,  themonth,  w=0,  l=0)
Return a month's calendar in a multi-line string. If w is provided, it specifies the width of the date columns, which are centered. If l is given, it specifies the number of lines that each week uses. Depends on the first weekday as specified in the constructor or set by the setfirstweekday() method.
prmonth(theyear,  themonth,  w=0,  l=0)
Print a month's calendar as returned by formatmonth().
formatyear(theyear,  w=2,  l=1,  c=6,  m=3)
Return a m-column calendar for an entire year as a multi-line string. Optional parameters w, l, and c are for date column width, lines per week, and number of spaces between month columns, respectively. Depends on the first weekday as specified in the constructor or set by the setfirstweekday() method. The earliest year for which a calendar can be generated is platform-dependent.
pryear(theyear,  w=2,  l=1,  c=6,  m=3)
Print the calendar for an entire year as returned by formatyear().

The calendar.HTMLCalendar Class

class calendar.HTMLCalendar(firstweekday=0)
This class can generate HTML calendars.

HTMLCalendar instances have the following methods:

formatmonth(theyear,  themonth,  withyear=True)
Return a month's calendar as an HTML table. If withyear is true the year will be included in the header, otherwise only the month name will be used.
formatyear(theyear,  width=3)
Return a year's calendar as an HTML table. width (defaulting to 3) specifies the number of months per row.
formatyearpage(theyear,  width=3,  css='calendar.css',  encoding=None)
Return a year's calendar as a complete HTML page. width (defaulting to 3) specifies the number of months per row. The css is the name for the cascading style sheet to be used. None can be passed if no style sheet should be used. The encoding specifies the encoding to be used for the output (defaulting to the system default encoding).
class calendar.LocaleTextCalendar(firstweekday=0,  locale=None)
This subclass of TextCalendar can be passed a locale name in the constructor and returns month and weekday names in the specified locale. If this locale includes an encoding all strings containing month and weekday names will be returned as unicode.
class calendar.LocaleHTMLCalendar(firstweekday=0,  locale=None)
This subclass of HTMLCalendar can be passed a locale name in the constructor and returns month and weekday names in the specified locale. If this locale includes an encoding all strings containing month and weekday names will be returned as unicode.

Note: The formatweekday() and formatmonthname() methods of these two classes temporarily change the current locale to the given locale. Because the current locale is a process-wide setting, they are not thread-safe.

calendar Functions

For simple text calendars, the calendar module provides the following functions:

calendar.setfirstweekday(weekday)
Sets the weekday (0 is Monday, 6 is Sunday) to start each week. The values MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, and SUNDAY are provided for convenience. For example, to set the first weekday to Sunday:

import calendarcalendar.setfirstweekday(calendar.SUNDAY)
calendar.firstweekday()
Returns the current setting for the weekday to start each week.
calendar.isleap(year)
Returns True if year is a leap year, otherwise False.
calendar.leapdays(y1, y2)
Returns the number of leap years in the range from y1 to y2 (exclusive), where y1 and y2 are years.

This function works for ranges spanning a century change.
calendar.weekday(year,  month,  day)
Returns the day of the week (0 is Monday) for year (1970–...), month (112), day (131).
calendar.weekheader(n)
Return a header containing abbreviated weekday names. The n specifies the width in characters for one weekday.
icalendar.monthrange(year,  month)
Returns weekday of first day of the month and number of days in month, for the specified year and month.
calendar.monthcalendar(year,  month)
Returns a matrix representing a month's calendar. Each row represents a week; days outside of the month a represented by zeros. Each week begins with Monday unless set by setfirstweekday().
calendar.prmonth(theyear,  themonth,  w=0,  l=0)
Prints a month's calendar as returned by month().
calendar.month(theyear,  themonth,  w=0,  l=0)
Returns a month's calendar in a multi-line string using the formatmonth() of the TextCalendar class.
calendar.prcal(year,  w=0,  l=0,  c=6,  m=3)
Prints the calendar for an entire year as returned by calendar().
calendar.calendar(year,  w=2,  l=1,  c=6,  m=3)
Returns a 3-column calendar for an entire year as a multi-line string using the formatyear() of the TextCalendar class.
calendar.timegm(tuple)
An unrelated but handy function that takes a time tuple such as returned by the gmtime() function in the time module, and returns the corresponding Unix timestamp value, assuming an epoch of 1970, and the POSIX encoding. In fact, time.gmtime() and timegm() are each others’ inverse.

The calendar module exports the following data attributes:

calendar.day_name
An array that represents the days of the week in the current locale.
calendar.day_abbr
An array that represents the abbreviated days of the week in the current locale.
calendar.month_name
An array that represents the months of the year in the current locale. This follows normal convention of January being month number 1, so it has a length of 13 and month_name[0] is the empty string.
calendar.month_abbr
An array that represents the abbreviated months of the year in the current locale. This follows normal convention of January being month number 1, so it has a length of 13 and month_abbr[0] is the empty string.

Collections: container datatypes

This module implements specialized container datatypes providing alternatives to Python's general purpose built-in containers, dict, list, set, and tuple.

collections.ChainMap objects

A ChainMap class is provided for quickly linking many mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.

The class can simulate nested scopes and is useful in templating.

class collections.ChainMap(*maps)
A ChainMap groups multiple dicts or other mappings together to create a single, updateable view. If no maps are specified, a single empty dictionary is provided so that a new chain always has at least one mapping.

The underlying mappings are stored in a list. That list is public and can accessed or updated using the maps attribute. There is no other state.

Lookups search the underlying mappings successively until a key is found. In contrast, writes, updates, and deletions only operate on the first mapping.

A ChainMap incorporates the underlying mappings by reference. So, if one of the underlying mappings gets updated, those changes will be reflected in ChainMap.

All of the usual dictionary methods are supported. Also, there is a maps attribute, a method for creating new subcontexts, and a property for accessing all but the first mapping:

maps
A user updateable list of mappings. The list is ordered from first-searched to last-searched. It is the only stored state and can be modified to change which mappings are searched. The list should always contain at least one mapping.
new_child(m=None)
Returns a new ChainMap containing a new map followed by all of the maps in the current instance. If m is specified, it becomes the new map at the front of the list of mappings; if not specified, an empty dict is used, so that a call to d.new_child() is equivalent to: ChainMap({}, *d.maps). This method is used for creating subcontexts that can be updated without altering values in any of the parent mappings.
parents
Property returning a new ChainMap containing all of the maps in the current instance except the first one. This is useful for skipping the first map in the search. Use cases are similar to those for the nonlocal keyword used in nested scopes. The use cases also parallel those for the built-in super() function. A reference to d.parents is equivalent to: ChainMap(*d.maps[1:]).

collections.ChainMap Examples and Recipes

This section shows various approaches to working with chained maps.

Example of simulating Python's internal lookup chain:

import builtins
pylookup = ChainMap(locals(), globals(), vars(builtins))

Example of letting user specified command-line arguments take precedence over environment variables which in turn take precedence over default values:

import os, argparse
defaults = {'color': 'red', 'user': 'guest'}
parser = argparse.ArgumentParser()
parser.add_argument('-u', '--user')
parser.add_argument('-c', '--color')
namespace = parser.parse_args()
command_line_args = {k:v for k, v in vars(namespace).items() if v}
combined = ChainMap(command_line_args, os.environ, defaults)
print(combined['color'])
print(combined['user'])

Example patterns for using the ChainMap class to simulate nested contexts:

c = ChainMap()        # Create root context
d = c.new_child()     # Create nested child context
e = c.new_child()     # Child of c, independent from d
e.maps[0]             # Current context dictionary -- like Python's locals()
e.maps[-1]            # Root context -- like Python's globals()
e.parents             # Enclosing context chain -- like Python's nonlocals
d['x']                # Get first key in the chain of contexts
d['x'] = 1            # Set value in current context
del d['x']            # Delete from current context
list(d)               # All nested values
k in d                # Check all nested values
len(d)                # Number of nested values
d.items()             # All nested items
dict(d)               # Flatten into a regular dictionary

The ChainMap class only makes updates (writes and deletions) to the first mapping in the chain while lookups search the full chain. However, if deep writes and deletions are desired, it is easy to make a subclass that updates keys found deeper in the chain:

class DeepChainMap(ChainMap):
    'Variant of ChainMap that allows direct updates to inner scopes'
    def __setitem__(self, key, value):
        for mapping in self.maps:
            if key in mapping:
                mapping[key] = value
                return
        self.maps[0][key] = value
    def __delitem__(self, key):
        for mapping in self.maps:
            if key in mapping:
                del mapping[key]
                return
        raise KeyError(key)
>>> d = DeepChainMap({'zebra': 'black'}, {'elephant': 'blue'}, {'lion': 'yellow'})
>>> d['lion'] = 'orange'         # update an existing key two levels down
>>> d['snake'] = 'red'           # new keys get added to the topmost dict
>>> del d['elephant']            # remove an existing key one level down
DeepChainMap({'zebra': 'black', 'snake': 'red'}, {}, {'lion': 'orange'})

collections.Counter objects

A counter tool is provided to support convenient and rapid tallies. For example:

>>> # Tally occurrences of words in a list
>>> cnt = Counter()
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})
>>> # Find the ten most common words in Hamlet
>>> import re
>>> words = re.findall(r'\w+', open('hamlet.txt').read().lower())
>>> Counter(words).most_common(10)
[('the', 1143), ('and', 966), ('to', 762), ('of', 669), ('i', 631),
 ('you', 554), ('a', 546), ('my', 514), ('hamlet', 471), ('in', 451)]
class collections.Counter([iterable-or-mapping])
A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.

Elements are counted from an iterable or initialized from another mapping (or counter):

>>> c = Counter()                           # a new, empty counter
>>> c = Counter('gallahad')                 # a new counter from an iterable
>>> c = Counter({'red': 4, 'blue': 2})      # a new counter from a mapping
>>> c = Counter(cats=4, dogs=8)             # a new counter from keyword args

Counter objects have a dictionary interface except that they return a zero count for missing items instead of raising a KeyError:

>>> c = Counter(['eggs', 'ham'])
>>> c['bacon']                              # count of a missing element is zero
0

Setting a count to zero does not remove an element from a counter. Use del to remove it entirely:

>>> c['sausage'] = 0                        # counter entry with a zero count
>>> del c['sausage']                        # del actually removes the entry

Counter objects support three methods beyond those available for all dictionaries:

elements()
Return an iterator over elements repeating each as many times as its count. Elements are returned in arbitrary order. If an element's count is less than one, elements() ignores it.

>>> c = Counter(a=4, b=2, c=0, d=-2)>>> list(c.elements())['a', 'a', 'a', 'a', 'b', 'b']
most_common([n])
Return a list of the n most common elements and their counts from the most common to the least. If n is not specified, most_common() returns all elements in the counter. Elements with equal counts are ordered arbitrarily:

>>> Counter('abracadabra').most_common(3)[('a', 5), ('r', 2), ('b', 2)]
subtract([iterable-or-mapping])
Elements are subtracted from an iterable or from another mapping (or counter). Like dict.update() but subtracts counts instead of replacing them. Both inputs and outputs may be zero or negative.

>>> c = Counter(a=4, b=2, c=0, d=-2)>>> d = Counter(a=1, b=2, c=3, d=4)>>> c.subtract(d)>>> cCounter({'a': 3, 'b': 0, 'c': -3, 'd': -6})

The usual dictionary methods are available for Counter objects except for two which work differently for counters.

fromkeys(iterable)
This class method is not implemented for Counter objects.
update([iterable-or-mapping])
Elements are counted from an iterable or added-in from another mapping (or counter). Like dict.update() but adds counts instead of replacing them. Also, the iterable is expected to be a sequence of elements, not a sequence of (key, value) pairs.

Common patterns for working with Counter objects:

sum(c.values())                 # total of all counts
c.clear()                       # reset all counts
list(c)                         # list unique elements
set(c)                          # convert to a set
dict(c)                         # convert to a regular dictionary
c.items()                       # convert to a list of (elem, cnt) pairs
Counter(dict(list_of_pairs))    # convert from a list of (elem, cnt) pairs
c.most_common()[:-n-1:-1]       # n least common elements
+c                              # remove zero and negative counts

Several mathematical operations are provided for combining Counter objects to produce multisets (counters that have counts greater than zero). Addition and subtraction combine counters by adding or subtracting the counts of corresponding elements. Intersection and union return the minimum and maximum of corresponding counts. Each operation can accept inputs with signed counts, but the output will exclude results with counts of zero or less.

>>> c = Counter(a=3, b=1)
>>> d = Counter(a=1, b=2)
>>> c + d                       # add two counters together:  c[x] + d[x]
Counter({'a': 4, 'b': 3})
>>> c - d                       # subtract (keeping only positive counts)
Counter({'a': 2})
>>> c & d                       # intersection:  min(c[x], d[x])
Counter({'a': 1, 'b': 1})
>>> c | d                       # union:  max(c[x], d[x])
Counter({'a': 3, 'b': 2})

Unary addition and subtraction are shortcuts for adding an empty counter or subtracting from an empty counter.

>>> c = Counter(a=2, b=-4)
>>> +c
Counter({'a': 2})
>>> -c
Counter({'b': 4})

Note: Counters were primarily designed to work with positive integers to represent running counts; however, care was taken to not unnecessarily preclude use cases needing other types or negative values. To help with those use cases, this section documents the minimum range and type restrictions.

  • The Counter class itself is a dictionary subclass with no restrictions on its keys and values. The values are intended to be numbers representing counts, but you could store anything in the value field.
  • The most_common() method requires only that the values be orderable.
  • For in-place operations such as c[key] += 1, the value type need only support addition and subtraction. So fractions, floats, and decimals would work and negative values are supported. The same is also true for update() and subtract() which allow negative and zero values for both inputs and outputs.
  • The multiset methods are designed only for use cases with positive values. The inputs may be negative or zero, but only outputs with positive values are created. There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.
  • The elements() method requires integer counts. It ignores zero and negative counts.

collections.deque objects

class collections.deque([iterable[,  maxlen]])
Returns a new deque object initialized left-to-right (using append()) with data from iterable. If iterable is not specified, the new deque is empty.

Deques are a generalization of stacks and queues (the name is pronounced “deck” and is short for “double-ended queue”). Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.

Though list objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation.

If maxlen is not specified or is None, deques may grow to an arbitrary length. Otherwise, the deque is bounded to the specified maximum length. Once a bounded length deque is full, when new items are added, a corresponding number of items are discarded from the opposite end. Bounded length deques provide functionality similar to the tail filter in Unix. They are also useful for tracking transactions and other pools of data where only the most recent activity is of interest.

Deque objects support the following methods:

append(x)
Add x to the right side of the deque.
appendleft(x)
Add x to the left side of the deque.
clear()
Remove all elements from the deque leaving it with length 0.
count(x)
Count the number of deque elements equal to x.
extend(iterable)
Extend the right side of the deque by appending elements from the iterable argument.
extendleft(iterable)
Extend the left side of the deque by appending elements from iterable. Note, the series of left appends results in reversing the order of elements in the iterable argument.
pop()
Remove and return an element from the right side of the deque. If no elements are present, raises an IndexError.
popleft()
Remove and return an element from the left side of the deque. If no elements are present, raises an IndexError.
remove(value)
Removed the first occurrence of value. If not found, raises a ValueError.
reverse()
Reverse the elements of the deque in-place and then return None.
rotate(n)
Rotate the deque n steps to the right. If n is negative, rotate to the left. Rotating one step to the right is equivalent to: d.appendleft(d.pop()).

Deque objects also provide one read-only attribute:

maxlen
Maximum size of a deque or None if unbounded.

In addition to the above, deques support iteration, pickling, len(d), reversed(d), copy.copy(d), copy.deepcopy(d), membership testing with the in operator, and subscript references such as d[-1]. Indexed access is O(1) at both ends but slows to O(n) in the middle. For fast random access, use lists instead.

Example:

>>> from collections import deque
>>> d = deque('ghi')                 # make a new deque with three items
>>> for elem in d:                   # iterate over the deque's elements
...     print(elem.upper())
G
H
I
>>> d.append('j')                    # add a new entry to the right side
>>> d.appendleft('f')                # add a new entry to the left side
>>> d                                # show the representation of the deque
deque(['f', 'g', 'h', 'i', 'j'])
>>> d.pop()                          # return and remove the rightmost item
'j'
>>> d.popleft()                      # return and remove the leftmost item
'f'
>>> list(d)                          # list the contents of the deque
['g', 'h', 'i']
>>> d[0]                             # peek at leftmost item
'g'
>>> d[-1]                            # peek at rightmost item
'i'
>>> list(reversed(d))                # list the contents of a deque in reverse
['i', 'h', 'g']
>>> 'h' in d                         # search the deque
True
>>> d.extend('jkl')                  # add multiple elements at once
>>> d
deque(['g', 'h', 'i', 'j', 'k', 'l'])
>>> d.rotate(1)                      # right rotation
>>> d
deque(['l', 'g', 'h', 'i', 'j', 'k'])
>>> d.rotate(-1)                     # left rotation
>>> d
deque(['g', 'h', 'i', 'j', 'k', 'l'])
>>> deque(reversed(d))               # make a new deque in reverse order
deque(['l', 'k', 'j', 'i', 'h', 'g'])
>>> d.clear()                        # empty the deque
>>> d.pop()                          # cannot pop from an empty deque
Traceback (most recent call last):
    File "<pyshell#6>", line 1, in -toplevel-
        d.pop()
IndexError: pop from an empty deque
>>> d.extendleft('abc')              # extendleft() reverses the input order
>>> d
deque(['c', 'b', 'a'])

collections.deque Recipes

This section shows various approaches to working with deques.

Bounded length deques provide functionality similar to the tail filter in Unix:

def tail(filename, n=10):
    'Return the last n lines of a file'
    with open(filename) as f:
        return deque(f, n)

Another approach to using deques is to maintain a sequence of recently added elements by appending to the right and popping to the left:

def moving_average(iterable, n=3):
    # moving_average([40, 30, 50, 46, 39, 44]) --> 40.0 42.0 45.0 43.0
    # http://en.wikipedia.org/wiki/Moving_average
    it = iter(iterable)
    d = deque(itertools.islice(it, n-1))
    d.appendleft(0)
    s = sum(d)
    for elem in it:
        s += elem - d.popleft()
        d.append(elem)
        yield s / n

The rotate() method provides a way to implement deque slicing and deletion. For example, a pure Python implementation of del d[n] relies on the rotate() method to position elements to be popped:

def delete_nth(d, n):
    d.rotate(-n)
    d.popleft()
    d.rotate(n)

To implement deque slicing, use a similar approach applying rotate() to bring a target element to the left side of the deque. Remove old entries with popleft(), add new entries with extend(), and then reverse the rotation. With minor variations on that approach, it is easy to implement Forth style stack manipulations such as dup, drop, swap, over, pick, rot, and roll.

collections.defaultdict Objects

class collections.defaultdict([default_factory [, ...]])
Returns a new dictionary-like object. defaultdict is a subclass of the built-in dict class. It overrides one method and adds one writable instance variable. The remaining functionality is the same as for the dict class and is not documented here.

The first argument provides the initial value for the default_factory attribute; it defaults to None. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.

defaultdict objects support the following method in addition to the standard dict operations:

__missing__(key)
If the default_factory attribute is None, this raises a KeyError exception with the key as argument.

If default_factory is not None, it is called without arguments to provide a default value for the given key, this value is inserted in the dictionary for the key, and returned.

If calling default_factory raises an exception this exception is propagated unchanged.

This method is called by the __getitem__() method of the dict class when the requested key is not found; whatever it returns or raises is then returned or raised by __getitem__().

Note that __missing__() is not called for any operations besides __getitem__(). This means that get() will, like normal dictionaries, return None as a default rather than using default_factory.

defaultdict objects support the following instance variable:

default_factory
This attribute is used by the __missing__() method; it is initialized from the first argument to the constructor, if present, or to None, if absent.

collections.defaultdict Examples

Using list as the default_factory, it is easy to group a sequence of key-value pairs into a dictionary of lists:

>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> list(d.items())
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list. The list.append() operation then attaches the value to the new list. When keys are encountered again, the look-up proceeds normally (returning the list for that key) and the list.append() operation adds another value to the list. This technique is simpler and faster than an equivalent technique using dict.setdefault():

>>> d = {}
>>> for k, v in s:
...     d.setdefault(k, []).append(v)
...
>>> list(d.items())
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

Setting the default_factory to int makes the defaultdict useful for counting (like a bag or multiset in other languages):

>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
...
>>> list(d.items())
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]

When a letter is first encountered, it is missing from the mapping, so the default_factory function calls int() to supply a default count of zero. The increment operation then builds up the count for each letter.

The function int() which always returns zero is a special case of constant functions. A faster and more flexible way to create constant functions is to use a lambda function that can supply any constant value (not only zero):

>>> def constant_factory(value):
...     return lambda: value
>>> d = defaultdict(constant_factory('<missing>'))
>>> d.update(name='John', action='ran')
>>> '%(name)s %(action)s to %(object)s' % d
'John ran to <missing>'

Setting the default_factory to set makes the defaultdict useful for building a dictionary of sets:

>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
>>> d = defaultdict(set)
>>> for k, v in s:
...     d[k].add(v)
...
>>> list(d.items())
[('blue', {2, 4}), ('red', {1, 3})]

collections.namedtuple() Factory Function for Tuples with Named Fields

Named tuples assign meaning to each position in a tuple and allow for more readable, self-documenting code. They can be used wherever regular tuples are used, and they add the ability to access fields by name instead of position index.

collections.namedtuple(typename,  field_names,  verbose=False,  rename=False)
Returns a new tuple subclass named typename. The new subclass is used to create tuple-like objects that have fields accessible by attribute lookup and being indexable and iterable. Instances of the subclass also have a helpful docstring (with typename and field_names) and a helpful __repr__() method which lists the tuple contents in a name=value format.

The field_names are a single string with each fieldname separated by whitespace and/or commas, for example 'x y' or 'x, y'. Alternatively, field_names are a sequence of strings such as ['x', 'y'].

Any valid Python identifier may be used for a fieldname except for names starting with an underscore. Valid identifiers consist of letters, digits, and underscores but do not start with a digit or underscore and cannot be a keyword such as class, for, return, global, pass, or raise.

If rename is true, invalid fieldnames are automatically replaced with positional names. For example, ['abc', 'def', 'ghi', 'abc'] is converted to ['abc', '_1', 'ghi', '_3'], eliminating the keyword def and the duplicate fieldname abc.

If verbose is true, the class definition is printed after it is built. This option is outdated; instead, it is simpler to print the _source attribute. Named tuple instances do not have per-instance dictionaries, so they are lightweight and require no more memory than regular tuples.
>>> # Basic example
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(11, y=22)     # instantiate with positional or keyword arguments
>>> p[0] + p[1]             # indexable like the plain tuple (11, 22)
33
>>> x, y = p                # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y               # fields also accessible by name
33
>>> p                       # readable __repr__ with a name=value style
Point(x=11, y=22)

Named tuples are especially useful for assigning field names to result tuples returned by the csv or sqlite3 modules:

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')
import csv
for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))):
    print(emp.name, emp.title)
import sqlite3
conn = sqlite3.connect('/companydata')
cursor = conn.cursor()
cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
for emp in map(EmployeeRecord._make, cursor.fetchall()):
    print(emp.name, emp.title)

In addition to the methods inherited from tuples, named tuples support three additional methods and two attributes. To prevent conflicts with field names, the method and attribute names start with an underscore.

classmethod somenamedtuple._make(iterable)
Class method that makes a new instance from an existing sequence or iterable.

>>> t = [11, 22]>>> Point._make(t)Point(x=11, y=22)
somenamedtuple._asdict()
Return a new OrderedDict which maps field names to their corresponding values. Note, this method is no longer needed now that the same effect can be achieved using the built-in vars() function:

>>> vars(p)OrderedDict([('x', 11), ('y', 22)])
somenamedtuple._replace(kwargs)
Return a new instance of the named tuple replacing specified fields with new values:

>>> p = Point(x=11, y=22)>>> p._replace(x=33)Point(x=33, y=22)>>> for partnum, record in inventory.items():...     inventory[partnum] = record._replace(price=newprices[partnum], \ timestamp=time.now())
somenamedtuple._source
A string with the pure Python source code used to create the named tuple class. The source makes the named tuple self-documenting. It can be printed, executed using exec(), or saved to a file and imported.
somenamedtuple._fields
Tuple of strings listing the field names. Useful for introspection and for creating new named tuple types from existing named tuples.

>>> p._fields            # view the field names('x', 'y')>>> Color = namedtuple('Color', 'red green blue')>>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)>>> Pixel(11, 22, 128, 255, 0)Pixel(x=11, y=22, red=128, green=255, blue=0)

To retrieve a field whose name is stored in a string, use the getattr() function:

>>> getattr(p, 'x')
11

To convert a dictionary to a named tuple, use the double-star-operator (as described in Unpacking Argument Lists):

>>> d = {'x': 11, 'y': 22}
>>> Point(**d)
Point(x=11, y=22)

Since a named tuple is a regular Python class, it is easy to add or change functionality with a subclass. Here is how to add a calculated field and a fixed-width print format:

>>> class Point(namedtuple('Point', 'x y')):
    __slots__ = ()
    @property
    def hypot(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
    def __str__(self):
        return 'Point: x=%6.3f  y=%6.3f  hypot=%6.3f' % (self.x, self.y, self.hypot)
>>> for p in Point(3, 4), Point(14, 5/7):
    print(p)
Point: x= 3.000  y= 4.000  hypot= 5.000
Point: x=14.000  y= 0.714  hypot=14.018

The subclass shown above sets __slots__ to an empty tuple. This helps keep memory requirements low by preventing the creation of instance dictionaries.

Subclassing is not useful for adding new, stored fields. Instead, create a new named tuple type from the _fields attribute:

>>> Point3D = namedtuple('Point3D', Point._fields + ('z',))

Default values can be implemented using _replace() to customize a prototype instance:

>>> Account = namedtuple('Account', 'owner balance transaction_count')
>>> default_account = Account('<owner name>', 0.0, 0)
>>> johns_account = default_account._replace(owner='John')
>>> janes_account = default_account._replace(owner='Jane')

Enumerated constants can be implemented with named tuples, but it is simpler and more efficient to use a simple class declaration:

>>> Status = namedtuple('Status', 'open pending closed')._make(range(3))
>>> Status.open, Status.pending, Status.closed
(0, 1, 2)
>>> class Status:
    open, pending, closed = range(3)

collections.OrderedDict objects

Ordered dictionaries are like regular dictionaries but they remember the order that items were inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added.

class collections.OrderedDict([items])
Return an instance of a dict subclass, supporting the usual dict methods. An OrderedDict is a dict that remembers the order that keys were first inserted. If a new entry overwrites an existing entry, the original insertion position is left unchanged. Deleting an entry and reinserting it moves it to the end.

methods:

popitem(last=True)
The popitem() method for ordered dictionaries returns and removes a (key, value) pair. The pairs are returned in LIFO order if last is true or FIFO order if false.
move_to_end(key,  last=True)
Move an existing key to either end of an ordered dictionary. The item is moved to the right end if last is true (the default) or to the beginning if last is false. Raises KeyError if the key does not exist:

>>> d = OrderedDict.fromkeys('abcde')>>> d.move_to_end('b')>>> ''.join(d.keys())'acdeb'>>> d.move_to_end('b', last=False)>>> ''.join(d.keys())'bacde'

In addition to the usual mapping methods, ordered dictionaries also support reverse iteration using reversed().

Equality tests between OrderedDict objects are order-sensitive and are implemented as list(od1.items())==list(od2.items()). Equality tests between OrderedDict objects and other Mapping objects are order-insensitive like regular dictionaries. This allows OrderedDict objects to be substituted anywhere a regular dictionary is used.

The OrderedDict constructor and update() method both accept keyword arguments, but their order is lost because Python's function call semantics pass-in keyword arguments using a regular unordered dictionary.

collections.OrderedDict Examples and Recipes

Since an ordered dictionary remembers its insertion order, it can be used in conjunction with sorting to make a sorted dictionary:

>>> # regular unsorted dictionary
>>> d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
>>> # dictionary sorted by key
>>> OrderedDict(sorted(d.items(), key=lambda t: t[0]))
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
>>> # dictionary sorted by value
>>> OrderedDict(sorted(d.items(), key=lambda t: t[1]))
OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])
>>> # dictionary sorted by length of the key string
>>> OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
OrderedDict([('pear', 1), ('apple', 4), ('orange', 2), ('banana', 3)])

The new sorted dictionaries maintain their sort order when entries are deleted. But when new keys are added, the keys are appended to the end and the sort is not maintained.

It is also straightforward to create an ordered dictionary variant that remembers the order the keys were last inserted. If a new entry overwrites an existing entry, the original insertion position is changed and moved to the end:

class LastUpdatedOrderedDict(OrderedDict):
    'Store items in the order the keys were last added'
    def __setitem__(self, key, value):
        if key in self:
            del self[key]
        OrderedDict.__setitem__(self, key, value)

An ordered dictionary can be combined with the Counter class so that the counter remembers the order elements are first encountered:

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

collections.UserDict objects

The class, UserDict acts as a wrapper around dictionary objects. The need for this class is partially supplanted by the ability to subclass directly from dict; however, this class can be easier to work with because the underlying dictionary is accessible as an attribute.

class collections.UserDict([initialdata])
A class that simulates a dictionary. The instance's contents are kept in a regular dictionary, which is accessible via the data attribute of UserDict instances. If initialdata is provided, data is initialized with its contents; note that a reference to initialdata aren't kept, allowing it be used for other purposes.

In addition to supporting the methods and operations of mappings, UserDict instances provide the following attribute:

data
A real dictionary used to store the contents of the UserDict class.

collections.UserList objects

This class acts as a wrapper around list objects. It is a useful base class for your list-like classes that can inherit from them and override existing methods or add new ones. In this way, one can add new behaviors to lists.

The need for this class is partially supplanted by the ability to subclass directly from list; however, this class can be easier to work with because the underlying list is accessible as an attribute.

class collections.UserList([list])
A class that simulates a list. The instance's contents are kept in a regular list, which is accessible via the data attribute of UserList instances. The instance's contents are initially set to a copy of list, defaulting to the empty list []. list is any iterable, for example a real Python list or a UserList object.

In addition to supporting the methods and operations of mutable sequences, UserList instances provide the following attribute:

data
A real list object used to store the contents of the UserList class.

Subclassing requirements: Subclasses of UserList are expected to offer a constructor that can be called with either no arguments or one argument. List operations which return a new sequence attempt to create an instance of the actual implementation class. To do so, it assumes that the constructor can be called with a single parameter, which is a sequence object used as a data source.

If a derived class does not want to comply with this requirement, all of the special methods supported by this class need to be overridden; please consult the sources for information about the methods which need to be provided in that case.

collections.UserString objects

The class, UserString acts as a wrapper around string objects. The need for this class is partially supplanted by the ability to subclass directly from str; however, this class can be easier to work with because the underlying string is accessible as an attribute.

class collections.UserString([sequence])
A class that simulates a string or a Unicode string object. The instance's content is kept in a regular string object, which is accessible via the data attribute of UserString instances. The instance's contents are initially set to a copy of sequence. The sequence is an instance of bytes, str, UserString (or a subclass) or an arbitrary sequence that can be converted into a string using the built-in str() function.

Bisect: array bisection algorithm

This module provides support for maintaining a list in sorted order without having to sort the list after each insertion. For long lists of items with expensive comparison operations, this is an improvement over the more common approach. The module is called bisect because it uses a basic bisection algorithm to do its work. The source code may be most useful as a working example of the algorithm (the boundary conditions are already right!).

The following functions are provided:

bisect.bisect_left(a,  x,  lo=0,  hi=len(a))
Locate the insertion point for x in a to maintain sorted order. The parameters lo and hi may be used to specify a subset of the list which should be considered; by default the entire list is used. If x is already present in a, the insertion point will be before (to the left of) any existing entries. The return value is suitable for use as the first parameter to list.insert() assuming that a is already sorted.

The returned insertion point i partitions the array a into two halves so that all(val < x for val in a[lo:i]) for the left side and all(val >= x for val in a[i:hi]) for the right side.
bisect.bisect_right(a,  x,  lo=0,  hi=len(a))bisect.bisect(a,  x,  lo=0,  hi=len(a))
Similar to bisect_left(), but returns an insertion point which comes after (to the right of) any existing entries of x in a.

The returned insertion point i partitions the array a into two halves so that all(val <= x for val in a[lo:i]) for the left side and all(val >= x for val in a[i:hi]) for the right side.
bisect.insort_left(a, x, lo=0, hi=len(a))
Insert x in a in sorted order. This is equivalent to a.insert(bisect.bisect_left(a, x, lo, hi), x) assuming that a is already sorted. Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.
bisect.insort_right(a,  x,  lo=0,  hi=len(a))bisect.insort(a,  x,  lo=0,  hi=len(a))
Similar to insort_left(), but inserting x in a after any existing entries of x.

Using bisect To Search Sorted Lists

The above bisect() functions are useful for finding insertion points but can be tricky or awkward to use for common searching tasks. The following five functions show how to transform them into the standard look ups for sorted lists:

def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError
def find_lt(a, x):
    'Find rightmost value less than x'
    i = bisect_left(a, x)
    if i:
        return a[i-1]
    raise ValueError
def find_le(a, x):
    'Find rightmost value less than or equal to x'
    i = bisect_right(a, x)
    if i:
        return a[i-1]
    raise ValueError
def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError
def find_ge(a, x):
    'Find leftmost item greater than or equal to x'
    i = bisect_left(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

Examples

The bisect() function can be useful for numeric table look ups. This example uses bisect() to look up a letter grade for an exam score (say) based on a set of ordered numeric breakpoints: 90 and up is an ‘A’, 80 to 89 is a ‘B’, and so on:

>>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
...     i = bisect(breakpoints, score)
...     return grades[i]
...
>>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
['F', 'A', 'C', 'C', 'B', 'A', 'A']

Unlike the sorted() function, it does not make sense for the bisect() functions to have key or reversed arguments because that would lead to an inefficient design (successive calls to bisect functions would not “remember” all of the previous key look ups).

Instead, it is better to search a list of precomputed keys to find the index of the record in question:

>>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
>>> data.sort(key=lambda r: r[1])
>>> keys = [r[1] for r in data]         # precomputed list of keys
>>> data[bisect_left(keys, 0)]
('black', 0)
>>> data[bisect_left(keys, 1)]
('blue', 1)
>>> data[bisect_left(keys, 5)]
('red', 5)
>>> data[bisect_left(keys, 8)]
('yellow', 8)

Array: efficient arrays of numeric values

This module defines an object type that can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time using a type code, which is a single character. The following type codes are defined:

Type code C Type Python Type Minimum size in bytes Notes
'b' signed char int 1
'B' unsigned char int 1
'u' Py_UNICODE Unicode character 2 1.
'h' signed short int 2
'H' unsigned short int 2
'i' signed int int 2
'I' unsigned int int 2
'l' signed long int 4
'L' unsigned long int 4
'q' signed long long int 8 2.
'Q' unsigned long long int 8 2.
'f' float float 4
'd' double float 8

Notes:

  • The 'u' type code corresponds to Python's obsolete unicode character (Py_UNICODE which is wchar_t). Depending on the platform, it can be 16 bits or 32 bits.

    'u' will be removed together with the rest of the Py_UNICODE API.
  • The 'q' and 'Q' type codes are available only if the platform C compiler used to build Python supports C long long, or, on Windows, __int64.

The actual representation of values is determined by the machine architecture (strictly speaking, by the C implementation). The actual size can be accessed through the itemsize attribute.

The module defines the following types:

class array.array(typecode [, initializer])
A new array whose items are restricted by typecode, and initialized from the optional initializer value, which must be a list, a bytes-like object, or iterable over elements of the appropriate type.

If given a list or string, the initializer is passed to the new array's fromlist(), frombytes(), or fromunicode() method (see below) to add initial items to the array. Otherwise, the iterable initializer is passed to the extend() method.
array.typecodes
A string with all available type codes.

Array objects support the ordinary sequence operations of indexing, slicing, concatenation, and multiplication. When using slice assignment, the assigned value must be an array object with the same type code; in all other cases, TypeError is raised. Array objects also implement the buffer interface, and may be used wherever bytes-like objects are supported.

The following data items and methods are also supported:

array.typecode
The typecode character used to create the array.
array.itemsize
The length in bytes of one array item in the internal representation.
array.append(x)
Append a new item with value x to the end of the array.
array.buffer_info()
Return a tuple (address, length) giving the current memory address and the length in elements of the buffer used to hold array's contents. The size of the memory buffer in bytes can be computed as array.buffer_info() * array.itemsize. This is occasionally useful when working with low-level (and inherently unsafe) I/O interfaces that require memory addresses, such as certain ioctl() operations. The returned numbers are valid as long as the array exists and no length-changing operations are applied to it.

Note: When using array objects from code written in C or C++ (the only way to effectively make use of this information), it makes more sense to use the buffer interface supported by array objects. This method is maintained for backward compatibility and should be avoided in new code.
array.byteswap()
“Byteswap” all items of the array. This is only supported for values which are 1, 2, 4, or 8 bytes in size; for other types of values, RuntimeError is raised. It is useful when reading data from a file written on a machine with a different byte order.
array.count(x)
Return the number of occurrences of x in the array.
array.extend(iterable)
Append items from iterable to the end of the array. If iterable is another array, it must have the same type code; if not, TypeError will be raised. If iterable is not an array, it must be iterable and its elements must be the right type to be appended to the array.
array.frombytes(s)
Appends items from the string s, interpreting the string as an array of machine values (as if it had been read from a file using the fromfile() method).
array.fromfile(f, n)
Read n items (as machine values) from the file object f and append them to the end of the array. If less than n items are available, EOFError is raised, but the items that were available are still inserted into the array. The f must be a real built-in file object; something else with a read() method won’t do.
array.fromlist(list)
Append items from the list. This is equivalent to for x in list: a.append(x) except that if there is a type error, the array is unchanged.
array.fromstring()
Deprecated alias for frombytes().
array.fromunicode(s)
Extends this array with data from the given unicode string. The array must be a type 'u' array; otherwise a ValueError is raised. Use array.frombytes(unicodestring.encode(enc)) to append Unicode data to an array of some other type.
array.index(x)
Return the smallest i such that i is the index of the first occurrence of x in the array.
array.insert(i, x)
Insert a new item with value x in the array before position i. Negative values are treated as being relative to the end of the array.
array.pop([i])
Removes the item with the index i from the array and returns it. The optional argument defaults to -1, so that by default the last item is removed and returned.
array.remove(x)
Remove the first occurrence of x from the array.
array.reverse()
Reverse the order of the items in the array.
array.tobytes()
Convert the array to an array of machine values and return the bytes representation (the same sequence of bytes that would be written to a file by the tofile() method.)
array.tofile(f)
Write all items (as machine values) to the file object f.
array.tolist()
Convert the array to an ordinary list with the same items.
array.tostring()
Deprecated alias for tobytes().
array.tounicode()
Convert the array to a unicode string. The array must be a type 'u' array; otherwise a ValueError is raised. Use array.tobytes().decode(enc) to obtain a unicode string from an array of some other type.

When an array object is printed or converted to a string, it is represented as array(typecode, initializer). The initializer is omitted if the array is empty, otherwise it is a string if the typecode is 'u', otherwise it is a list of numbers. The string is guaranteed to be able to be converted back to an array with the same type and value using eval(), so long as the array() function is imported using from array import array. Examples:

array('l')
array('u', 'hello \u2641')
array('l', [1, 2, 3, 4, 5])
array('d', [1.0, 2.0, 3.14])

Enum: support for enumerations

An enumeration is a set of symbolic names (members) bound to unique, constant values. Within an enumeration, the members can be compared by identity, and the enumeration itself can be iterated over.

enum Module Contents

This module defines two enumeration classes that can define unique sets of names and values: Enum and IntEnum. It also defines one decorator, unique().

class enum.Enum
Base class for creating enumerated constants. See section Functional API for an alternate construction syntax.
class enum.IntEnum
Base class for creating enumerated constants that are also subclasses of int.
enum.unique()
Enum class decorator that ensures only one name is bound to any one value.

Creating an enum.Enum

Enumerations are created using the class syntax, which makes them easy to read and write. To define an enumeration, subclass Enum as follows:

>>> from enum import Enum
>>> class Color(Enum):
...     red = 1
...     green = 2
...     blue = 3
...

Notes:

  • The class Color is an enumeration (or enum)
  • The attributes Color.red, Color.green, etc., are enumeration members (or enum members).
  • The enum members have names and values (the name of Color.red is red, the value of Color.blue is 3, etc.)

Note:

Even though we use the class syntax to create Enums, Enums are not normal Python classes.

Enumeration members have human readable string representations:

>>> print(Color.red)
Color.red

...while their repr has more information:

>>> print(repr(Color.red))
<Color.red: 1>

The type of an enumeration member is the enumeration it belongs to:

>>> type(Color.red)
<enum 'Color'>
>>> isinstance(Color.green, Color)
True
>>>

Enum members also have a property containing their item name:

>>> print(Color.red.name)
red

Enumerations support iteration, in definition order:

>>> class Shake(Enum):
...     vanilla = 7
...     chocolate = 4
...     cookies = 9
...     mint = 3
...
>>> for shake in Shake:
...     print(shake)
...
Shake.vanilla
Shake.chocolate
Shake.cookies
Shake.mint

Enumeration members are hashable, so they can be used in dictionaries and sets:

>>> apples = {}
>>> apples[Color.red] = 'red delicious'
>>> apples[Color.green] = 'granny smith'
>>> apples == {Color.red: 'red delicious', Color.green: 'granny smith'}
True

Programmatic Access To Enumeration Members And Their Attributes

Sometimes it's useful to access members in enumerations programmatically (i.e., situations where Color.red won’t do because the exact color is not known at program-writing time). Enum allows such access:

>>> Color(1)
<Color.red: 1>
>>> Color(3)
<Color.blue: 3>

To access enum members by name, use item access:

>>> Color['red']
<Color.red: 1>
>>> Color['green']
<Color.green: 2>

If you have an enum member and need its name or value:

>>> member = Color.red
>>> member.name
'red'
>>> member.value
1

Duplicating enum Members And Values

Having two enum members with the same name is invalid:

>>> class Shape(Enum):
...     square = 2
...     square = 3
...
Traceback (most recent call last):
...
TypeError: Attempted to reuse key: 'square'

However, two enum members are allowed to have the same value. Given two members A and B with the same value (and A defined first), B is an alias to A. By-value lookup of the value of A and B returns A. By-name lookup of B also return A:

>>> class Shape(Enum):
...     square = 2
...     diamond = 1
...     circle = 3
...     alias_for_square = 2
...
>>> Shape.square
<Shape.square: 2>
>>> Shape.alias_for_square
<Shape.square: 2>
>>> Shape(2)
<Shape.square: 2>

Note: Attempting to create a member with the same name as an already defined attribute (another member, a method, etc.) or attempting to create an attribute with the same name as a member is not allowed.

Ensuring Unique Enumeration Values

By default, enumerations allow multiple names as aliases for the same value. When this behavior isn’t desired, the following decorator can ensure each value is used only once in the enumeration:

@enum.unique

A class decorator specifically for enumerations. It searches an enumeration's __members__ gathering any aliases it finds; if any are found ValueError is raised with the details:

>>> from enum import Enum, unique
>>> @unique
... class Mistake(Enum):
...     one = 1
...     two = 2
...     three = 3
...     four = 3
...
Traceback (most recent call last):
...
ValueError: duplicate values found in <enum 'Mistake'>: four -> three

Iteration

Iterating over the members of an enum does not provide the aliases:

>>> list(Shape)
[<Shape.square: 2>, <Shape.diamond: 1>, <Shape.circle: 3>]

The special attribute __members__ is an ordered dictionary mapping names to members. It includes all names defined in the enumeration, including the aliases:

>>> for name, member in Shape.__members__.items():
...     name, member
...
('square', <Shape.square: 2>)
('diamond', <Shape.diamond: 1>)
('circle', <Shape.circle: 3>)
('alias_for_square', <Shape.square: 2>)

The __members__ attribute can be used for detailed programmatic access to the enumeration members. For example, finding all the aliases:

>>> [name for name, member in Shape.__members__.items() if member.name != name]
['alias_for_square']

Comparisons

Enumeration members are compared by identity:

>>> Color.red is Color.red
True
>>> Color.red is Color.blue
False
>>> Color.red is not Color.blue
True

Ordered comparisons between enumeration values are not supported. Enum members are not integers (but see IntEnum below):

>>> Color.red < Color.blue
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: Color() < Color()

Equality comparisons are defined though:

>>> Color.blue == Color.red
False
>>> Color.blue != Color.red
True
>>> Color.blue == Color.blue
True

Comparisons against non-enumeration values always compare not equal (again, IntEnum was explicitly designed to behave differently, see below):

>>> Color.blue == 2
False

Allowed Members And Attributes Of Enumerations

The examples above use integers for enumeration values. Using integers is short and handy (and provided by default by the Functional API), but not strictly enforced. In the vast majority of use-cases, one doesn’t care what the actual value of an enumeration is. But if the value is important, enumerations can have arbitrary values.

Enumerations are Python classes, and can have methods and special methods as usual. If we have this enumeration:

>>> class Mood(Enum):
...     funky = 1
...     happy = 3
...
...     def describe(self):
...         # self is the member here
...         return self.name, self.value
...
...     def __str__(self):
...         return 'my custom str! {0}'.format(self.value)
...
...     @classmethod
...     def favorite_mood(cls):
...         # cls here is the enumeration
...         return cls.happy
...

Then:

>>> Mood.favorite_mood()
<Mood.happy: 3>
>>> Mood.happy.describe()
('happy', 3)
>>> str(Mood.funky)
'my custom str! 1'

The rules for what is allowed are as follows: _sunder_ names (starting and ending with a single underscore) are reserved by enum and cannot be used; all other attributes defined within an enumeration become members of this enumeration, except for __dunder__ names and descriptors (methods are also descriptors).

Note: if your enumeration defines __new__() and/or __init__() then whatever value(s) were given to the enum member will be passed into those methods.

Restricted Subclassing Of Enumerations

Subclassing an enumeration is allowed only if the enumeration does not define any members. So this is forbidden:

>>> class MoreColor(Color):
...     pink = 17
...
Traceback (most recent call last):
...
TypeError: Cannot extend enumerations

But this is allowed:

>>> class Foo(Enum):
...     def some_behavior(self):
...         pass
...
>>> class Bar(Foo):
...     happy = 1
...     sad = 2
...

Allowing subclassing of enums that define members would lead to a violation of some important invariants of types and instances. On the other hand, it makes sense to allow sharing some common behavior between a group of enumerations. (See OrderedEnum for an example.)

Pickling

Enumerations can be pickled and unpickled:

>>> from test.test_enum import Fruit
>>> from pickle import dumps, loads
>>> Fruit.tomato is loads(dumps(Fruit.tomato))
True

The usual restrictions for pickling apply: picklable enums must be defined in the top level of a module since unpickling requires them to be importable from that module.

Note:

With pickle protocol version 4 it is possible to easily pickle enums nested in other classes.

It is possible to modify how Enum members are pickled/unpickled by defining __reduce_ex__() in the enumeration class.

enum Functional API

The Enum class is callable, providing the following functional API:

>>> Animal = Enum('Animal', 'ant bee cat dog')
>>> Animal
<enum 'Animal'>
>>> Animal.ant
<Animal.ant: 1>
>>> Animal.ant.value
1
>>> list(Animal)
[<Animal.ant: 1>, <Animal.bee: 2>, <Animal.cat: 3>, <Animal.dog: 4>]

The semantics of this API resemble namedtuple. The first argument of the call to Enum is the name of the enumeration.

The second argument is the source of enumeration member names. It is a whitespace-separated string of names, a sequence of names, a sequence of 2-tuples with key/value pairs, or a mapping (e.g., dictionary) of names to values. The last two options enable assigning arbitrary values to enumerations; the others auto-assign increasing integers starting with 1. A new class derived from Enum is returned. In other words, the above assignment to Animal is equivalent to:

>>> class Animals(Enum):
...     ant = 1
...     bee = 2
...     cat = 3
...     dog = 4
...

The reason for defaulting to 1 as the starting number and not 0 is that 0 is False in a boolean sense, but enum members all evaluate to True.

Pickling enums created with the functional API can be tricky as frame stack implementation details are used to try and figure out the module of the enumeration. The solution is to specify the module name explicitly as follows:

>>> Animals = Enum('Animals', 'ant bee cat dog', module=__name__)
Warning

If module is not supplied, and Enum cannot determine what it is, the new Enum members isn't unpicklable; to keep errors closer to the source, pickling will be disabled.

The new pickle protocol 4 also, in some circumstances, relies on __qualname__ being set to the location where pickle can find the class. For example, if the class was made available in class SomeData in the global scope:

>>> Animals = Enum('Animals', 'ant bee cat dog', qualname='SomeData.Animals')

The complete signature is:

Enum(value='NewEnumName', names=<...>, *, module='...', qualname='...', type=<mixed-in class>)

Derived enumerations

IntEnum

A variation of Enum is provided that is also a subclass of int. Members of an IntEnum can be compared to integers; by extension, integer enumerations of different types can also be compared to each other:

>>> from enum import IntEnum
>>> class Shape(IntEnum):
...     circle = 1
...     square = 2
...
>>> class Request(IntEnum):
...     post = 1
...     get = 2
...
>>> Shape == 1
False
>>> Shape.circle == 1
True
>>> Shape.circle == Request.post
True

However, they still can’t be compared to standard Enum enumerations:

>>> class Shape(IntEnum):
...     circle = 1
...     square = 2
...
>>> class Color(Enum):
...     red = 1
...     green = 2
...
>>> Shape.circle == Color.red
False

IntEnum values behave like integers in other ways you’d expect:

>>> int(Shape.circle)
1
>>> ['a', 'b', 'c'][Shape.circle]
'b'
>>> [i for i in range(Shape.square)]
[0, 1]

For the vast majority of code, Enum is strongly recommended since IntEnum breaks some semantic promises of an enumeration (by being comparable to integers, and thus by transitivity to other unrelated enumerations). It should be used only in special cases where there's no other choice; for example, when integer constants are replaced with enumerations and backward compatibility is required with code that still expects integers.

Other Enums

While IntEnum is part of the enum module, it would be very simple to implement independently:

class IntEnum(int, Enum):
    pass

This demonstrates how similar derived enumerations can be defined; for example a StrEnum that mixes in str instead of int.

Some rules:

  • When subclassing Enum, mix-in types must appear before Enum itself in the sequence of bases, as in the IntEnum example above.
  • While Enum can have members of any type, once you mix in an additional type, all the members must have values of that type, e.g., int above. This restriction does not apply to mix-ins which only add methods and don’t specify another data type such as int or str.
  • When another data type is mixed in, the value attribute is not the same as the enum member itself, although it is equivalent and will compare equal.
  • %-style formatting: %s and %r call Enum‘s __str__() and __repr__() respectively; other codes (such as %i or %h for IntEnum) treat the enum member as its mixed-in type.
  • str.__format__() (or format()) use the mixed-in type's __format__(). If the Enum‘s str() or repr() is desired use the !s or !r str format codes.

Examples

While Enum and IntEnum are expected to cover the majority of use-cases, they cannot cover them all. Here are recipes for some different types of enumerations that can be used directly, or as examples for creating one's own.

enum Example: AutoNumber

Avoids having to specify the value for each enumeration member:

>>> class AutoNumber(Enum):
...     def __new__(cls):
...         value = len(cls.__members__) + 1
...         obj = object.__new__(cls)
...         obj._value_ = value
...         return obj
...
>>> class Color(AutoNumber):
...     red = ()
...     green = ()
...     blue = ()
...
>>> Color.green.value == 2
True

Note: The __new__() method, if defined, is used during creation of the Enum members; it is then replaced by Enum's __new__() which is used after class creation for look up of existing members. Due to the way Enums are supposed to behave, there is no way to customize Enum's __new__().

enum example: OrderedEnum

An ordered enumeration that is not based on IntEnum and so maintains the normal Enum invariants (such as not being comparable to other enumerations):

>>> class OrderedEnum(Enum):
...     def __ge__(self, other):
...         if self.__class__ is other.__class__:
...             return self.value >= other.value
...         return NotImplemented
...     def __gt__(self, other):
...         if self.__class__ is other.__class__:
...             return self.value > other.value
...         return NotImplemented
...     def __le__(self, other):
...         if self.__class__ is other.__class__:
...             return self.value <= other.value
...         return NotImplemented
...     def __lt__(self, other):
...         if self.__class__ is other.__class__:
...             return self.value < other.value
...         return NotImplemented
...
>>> class Grade(OrderedEnum):
...     A = 5
...     B = 4
...     C = 3
...     D = 2
...     F = 1
...
>>> Grade.C < Grade.A
True

enum Example: DuplicateFreeEnum

Raises an error if a duplicate member name is found instead of creating an alias:

>>> class DuplicateFreeEnum(Enum):
...     def __init__(self, *args):
...         cls = self.__class__
...         if any(self.value == e.value for e in cls):
...             a = self.name
...             e = cls(self.value).name
...             raise ValueError(
...                 "aliases not allowed in DuplicateFreeEnum:  %r --> %r"
...                 % (a, e))
...
>>> class Color(DuplicateFreeEnum):
...     red = 1
...     green = 2
...     blue = 3
...     grene = 2
...
Traceback (most recent call last):
...
ValueError: aliases not allowed in DuplicateFreeEnum:  'grene' --> 'green'

Note:

This is a useful example for subclassing Enum to add or change other behaviors and disallowing aliases. If the only desired change is disallowing aliases, the unique() decorator can be used instead.

enum Example: Planet

If __new__() or __init__() is defined the value of the enum member will be passed to those methods:

>>> class Planet(Enum):
...     MERCURY = (3.303e+23, 2.4397e6)
...     VENUS   = (4.869e+24, 6.0518e6)
...     EARTH   = (5.976e+24, 6.37814e6)
...     MARS    = (6.421e+23, 3.3972e6)
...     JUPITER = (1.9e+27, 7.1492e7)
...     SATURN  = (5.688e+26, 6.0268e7)
...     URANUS  = (8.686e+25, 2.5559e7)
...     NEPTUNE = (1.024e+26, 2.4746e7)
...     def __init__(self, mass, radius):
...         self.mass = mass       # in kilograms
...         self.radius = radius   # in meters
...     @property
...     def surface_gravity(self):
...         # universal gravitational constant  (m3 kg-1 s-2)
...         G = 6.67300E-11
...         return G * self.mass / (self.radius * self.radius)
...
>>> Planet.EARTH.value
(5.976e+24, 6378140.0)
>>> Planet.EARTH.surface_gravity
9.802652743337129