Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


algorithm - Find the year with the most number of people alive in Python

Given a list of people with their birth and end years (all between 1900 and 2000), find the year in which the most people were alive.

Here is my somewhat brute-force solution:

def most_populated(population, single=True):
    years = dict()
    for person in population:
        for year in range(person[0], person[1]):
            if year in years:
                years[year] += 1
            else:
                years[year] = 0
    return max(years, key=years.get) if single else \
           [key for key, val in years.items() if val == max(years.values())]

print(most_populated([(1920, 1939), (1911, 1944),
                      (1920, 1955), (1938, 1939)]))
print(most_populated([(1920, 1939), (1911, 1944),
                      (1920, 1955), (1938, 1939), (1937, 1940)], False))

I'm trying to find a more efficient way to solve this problem in Python. Both readability and efficiency count. Moreover, for some reason my code won't print [1938, 1939], although it should.

Update

The input is a list of tuples, where the first element of each tuple is the year the person was born and the second element is the year of death.

Update 2

The end year (the second element of the tuple) also counts as a year in which the person was alive: if a person dies in September 1939 (we don't care about the month), they were alive for at least part of 1939. That should fix 1939 missing from the results.
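
As a rough illustration of the inclusive end year, here is a minimal brute-force sketch in Python 3. The name most_populated_inclusive is made up for this example, and unlike the code above it returns ties in chronological order; it is a sketch of the rule described in this update, not a tuned solution.

```python
from collections import Counter

def most_populated_inclusive(population, single=True):
    # Count every year from birth through death, end year included.
    years = Counter()
    for birth, death in population:
        years.update(range(birth, death + 1))
    most = max(years.values())
    # Sort so that ties are reported in chronological order.
    best = sorted(year for year, count in years.items() if count == most)
    return best[0] if single else best

people = [(1920, 1939), (1911, 1944), (1920, 1955), (1938, 1939), (1937, 1940)]
print(most_populated_inclusive(people, single=False))  # [1938, 1939]
```

The only change that matters for the missing 1939 is death + 1 as the upper bound of the range.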

Best solution?

While readability counts in favor of @joran-beasley's answer, the most efficient algorithm for larger inputs was provided by @njzk2. Thanks to @hannes-ovrén for providing an analysis in an IPython notebook on Gist.


1 Reply


Another solution I just thought of:

  • Create two tables, one for birth years and one for death years.
  • Count the births and deaths per year in those tables.
  • Walk through the years in order, keeping a running total of the number of people alive.

Total complexity is O(n) in the number of people, plus one pass over the range of years, which is bounded here (1900 to 2000).
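
To see where the linear bound comes from, here is a minimal sketch that assumes, as the question states, that every year lies between 1900 and 2000. The names FIRST_YEAR, LAST_YEAR and busiest_years are hypothetical and chosen only for this illustration; it uses a plain list instead of the two tables described above and always returns all tied years.

```python
FIRST_YEAR, LAST_YEAR = 1900, 2000  # assumed bounds, taken from the question

def busiest_years(population):
    # One slot per year, plus one extra slot for deaths in LAST_YEAR.
    size = LAST_YEAR - FIRST_YEAR + 2
    delta = [0] * size
    for birth, death in population:          # O(n): two updates per person
        delta[birth - FIRST_YEAR] += 1       # counted from the birth year on
        delta[death + 1 - FIRST_YEAR] -= 1   # no longer counted the year after death
    best, best_count, alive = [], 0, 0
    for offset in range(size - 1):           # fixed number of steps: one per year
        alive += delta[offset]
        if alive > best_count:
            best, best_count = [FIRST_YEAR + offset], alive
        elif alive == best_count and alive > 0:
            best.append(FIRST_YEAR + offset)
    return best

print(busiest_years([(1920, 1939), (1911, 1944), (1920, 1955), (1938, 1939)]))
```

Each person costs two list updates, and the second loop runs once per year regardless of how many people there are.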

Implementation

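from collections import Counter

def most_populated(population, single=True):
    # Each person is a (birth_year, death_year) tuple.
    birth = [person[0] for person in population]
    # The death year still counts as alive, so a person drops out one year later.
    death = [person[1] + 1 for person in population]
    b = Counter(birth)
    d = Counter(death)
    alive = 0
    years = {}
    # Running total: add arrivals and subtract departures, year by year.
    for year in range(min(birth), max(death) + 1):
        alive = alive + b[year] - d[year]
        years[year] = alive
    if single:
        return max(years, key=years.get)
    most = max(years.values())  # computed once instead of once per year
    return [key for key, val in years.items() if val == most]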

Better

from collections import Counter
from itertools import accumulate  # Python 3.2+

def most_populated(population, single=True):
    # +1 in each birth year, -1 in the year after each death year.
    delta = Counter(x[0] for x in population)
    delta.subtract(Counter(x[1] + 1 for x in population))
    start, end = min(delta), max(delta)
    # Prefix sums of the deltas give the number of people alive in each year.
    years = list(accumulate(delta[year] for year in range(start, end)))
    most = max(years)  # computed once instead of once per year
    if single:
        return years.index(most) + start
    return [i + start for i, val in enumerate(years) if val == most]
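
One way to gain some confidence in the difference-counting idea is to compare it against a plain year-by-year count on random input. The sketch below does that in Python 3; sweep and brute_force are names invented for this check, both return every tied year, and the random data assumes years between 1900 and 2000 as in the question.

```python
import random
from collections import Counter
from itertools import accumulate

def sweep(population):
    # Difference counting: +1 at birth, -1 the year after death.
    delta = Counter(birth for birth, _ in population)
    delta.subtract(Counter(death + 1 for _, death in population))
    start, end = min(delta), max(delta)
    alive = list(accumulate(delta[year] for year in range(start, end)))
    most = max(alive)
    return [start + i for i, count in enumerate(alive) if count == most]

def brute_force(population):
    # Reference answer: count every living year of every person.
    counts = Counter()
    for birth, death in population:
        counts.update(range(birth, death + 1))
    most = max(counts.values())
    return sorted(year for year, count in counts.items() if count == most)

rng = random.Random(0)  # fixed seed so the run is repeatable
for trial in range(200):
    people = []
    for person in range(rng.randint(1, 30)):
        birth = rng.randint(1900, 2000)
        people.append((birth, rng.randint(birth, 2000)))
    assert sweep(people) == brute_force(people)
print("sweep and brute force agree")
```

Agreement on random cases is evidence rather than proof, but it catches off-by-one mistakes around the end year quickly.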
