Usage¶
To use Stilpy in a project:
>>> from stilpy import TimeGaps
Minimal example¶
Suppose that you have serveral records of time stored in a list of dictionaries, like this: :
>>> list_dict = [
... {'t_dt':'start','dt': '2019-12-19 10:00:00'},
... {'t_dt':'end', 'dt': '2019-12-19 13:30:20'},
... {'t_dt':'start', 'dt': '2019-12-19 14:30:00'},
... {'t_dt':'start', 'dt': '2019-12-19 15:30:00'},
... {'t_dt':'end', 'dt': '2019-12-19 17:00:35'},
... {'t_dt':'start', 'dt': '2019-12-19 09:00:00'}
... ]
As you can see, these are time intervals. They have a start and an end point, but they are not in the rigth order. The first two elements are correct. But then we have two start points together. And the last one is a start point record that should be on the top of the list because is older than te others. What Stilpy can do for us is to make an iterator with those records, matching the start points with the right end, or giving them an unknown end if they don’t have one of their own.
To do that we need to make an instance of the TimeGaps class.
TimeGaps recieves several parameters. Some of them are optionals. Let’s see
which ones are needed for our example:
- iterable:
An iterable object that contains a list of items. Those items must be dicts or dictlike objects. Lists, tuples and objects with
__dict__atribute are accepted as well.
Every item, must content in itself the next items:
1. Adatetimeobject or a string formatdatetime. In our case, we have the second option.
2. An item that defines if the first element that we just mentioned is an initial or a final time point of a time interval.- tag_loc:
It tells
TimeGapswhere to find the tag that tells if the item is a start point or an end.- i_tag:
The name of the initial time tag in the iterable. Default is
'start'.- f_tag:
The name of the final time tag in the iterable. Default is
'end'.- dt_loc:
The location, inside each element of the iterable, of the
datetimeinformation. It can be a dictionary key or an index, depending on the collection.
The rest of parameters are optionals, and we won’t need them yet.
Now we can call create the TimeGaps object, passing the arguments in
order:
>>> ti = TimeGaps(list_dict, 't_dt', 'start', 'end', 'dt')
Now we have our iterator. Every item is a TimeInterval object with
some attributes like start,
end,
duration and
is_perfect.
You can add any other attribute to the
TimeInterval object, but we will see
that later. Now we are just going to print each element. But first, we will ask
for the sum of the durations of all its intervals. At the same time, we’ll pass
an argument that will be returned if TimeGaps is unable to make the sum
because some interval hasn’t a duration. By default None will be returned.
>>> ti.total_duration('Sorry! Some interval is not perfect')
'Sorry! Some interval is not perfect'
>>> for i, t in enumerate(ti):
... s = t.start if t.start!='' else 'unknown'
... e = t.end if t.end!='' else 'unknown'
... d = t.duration if t.duration!='' else 'unknown'
... print(f'Interval {i + 1} ->')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
Interval 1 ->
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 3 ->
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Interval 4 ->
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
You can see that, by default, an empty property is set to ''.
If we pass a perfect time intervals collection total_duration will return a very different result. Let’s see an example.
>>> perfect_list_dict = [
... {'t_dt':'start','dt': '2019-12-19 10:00:00'},
... {'t_dt':'end', 'dt': '2019-12-19 13:30:20'},
... {'t_dt':'start', 'dt': '2019-12-19 14:30:00'},
... {'t_dt':'end', 'dt': '2019-12-19 17:00:35'},
... ]
>>> ti_p = TimeGaps(perfect_list_dict, 't_dt', 'start', 'end', 'dt')
>>> for t in ti_p:
... print(t.duration)
3:30:20
2:30:35
>>> ti_p.total_duration()
datetime.timedelta(seconds=21655)
>>> print(ti_p.total_duration())
6:00:55
As you can see, this method returns a timedelta object with the sum
of the duration of every TimeInterval.
Time intervals with groups¶
In the previous example, we just got the records that need to be ordered
and put together. But what happens if we have records that belong to
different groups, all together in the same collection? Well, for that we
have the group_by parameter.
Let’s try another example.
Imagine we’re working with the sign-in and sign-out of the employees from the company’s web application. We should have something like this:
>>> keys_dicts = [
... {
... 'name': 'Eve', 'surname': 'Palmer',
... 't_dt':'start', 'dt': '2019-12-19 10:00:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'end', 'dt': '2019-12-19 11:00:05'
... },
... {
... 'name': 'Moses', 'surname': 'Farrel',
... 't_dt':'start', 'dt': '2019-12-19 10:00:05'
... },
... {
... 'name': 'Eve', 'surname': 'Palmer',
... 't_dt':'end', 'dt': '2019-12-19 13:30:20'
... },
... {
... 'name': 'Moses', 'surname': 'Farrel',
... 't_dt':'end', 'dt': '2019-12-19 13:45:15'
... },
... {
... 'name': 'Eve', 'surname': 'Palmer',
... 't_dt':'start', 'dt': '2019-12-19 14:30:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'start', 'dt': '2019-12-19 15:30:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'end', 'dt': '2019-12-19 17:00:35'
... },
... {
... 'name': 'Moses', 'surname': 'Farrel',
... 't_dt':'start', 'dt': '2019-12-19 09:00:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'start', 'dt': '2019-12-19 10:00:02'
... },
... ]
We cannot order these records based only on their temporary value.
If we do that, we’ll be ignoring that every record belongs to a different person.
So we have to use the group_by parameter by saying which keys should use
TimeGaps to order this records. Let’s see how:
For our example we need to group the records by name and surname. group_by
is a keyword argumen and it’s expecting a single element or a collection,
preferred a tuple. So we do it like this:
>>> ti_g = TimeGaps(
... keys_dicts, 't_dt', 'start', 'end', 'dt',
... group_by=('name', 'surname')
... )
But, additionally maybe we want to store that pairs of keys and values
of names and surnames inside of te TimeInterval objects, in order
to differentiate some intervals from others. As we said before group_by
is a keyword argumen. Any other positional argumen used to instanciate the
TimeGaps class different of
iterable, tag_loc, i_tag, f_tag
and dt_loc will be treated as the key for creating the additional
attributes for the TimeInterval objects of a TimeGaps iterator (this
option is not aviable if your are working with an iterable of any collection
that works with index instead of keys, like list, tuples… So if you have a
list of list or a list of tuple, your can use group_by but you can’t add
additionals attributes to the TimeInterval objects). So we can change the
instanciation like this:
>>> ti_g = TimeGaps(
... keys_dicts, 't_dt', 'start', 'end', 'dt',
... 'name', 'surname',
... group_by=('name', 'surname')
... )
Now if we print every element we should see how the TimeInterval objects
has been created by groups, and how they are ordered in the collection.
>>> for i, tg in enumerate(ti_g):
... s = tg.start if tg.start!='' else 'unknown'
... e = tg.end if tg.end!='' else 'unknown'
... d = tg.duration if tg.duration!='' else 'unknown'
... emp = f'{tg.name} {tg.surname}'
... print(f'Interval {i + 1} ->')
... print(f'\t\tEmployee: {emp}')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
Interval 1 ->
Employee: Moses Farrel
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Employee: Eve Palmer
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 3 ->
Employee: Cecilia Park
Start: 2019-12-19 10:00:02
End: 2019-12-19 11:00:05
Duration: 1:00:03
Interval 4 ->
Employee: Moses Farrel
Start: 2019-12-19 10:00:05
End: 2019-12-19 13:45:15
Duration: 3:45:10
Interval 5 ->
Employee: Eve Palmer
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Interval 6 ->
Employee: Cecilia Park
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
If we have two records that will be conform an interval add an extra argument this will be the expected behaviour:
if the key is present in both records with the same value, that value will be used to populate the new attribute
if the key is present in both records with different values, the attribute will be populated with a tuple whose first element is the value the start record’s value for that key, and the second will be the end record’s value for that key
if the key is present just in one of the records, its value will be used to populate the attribute
But what happens if we want different iterators, one per element of the group?
Let’s say that we want a iterator for every employee. You can easily have it. In
fact you will get a list of TimeGaps objects, one for each employee. You
just need to call the
grouped_intervals property.
First let’s see the groups that we have, by calling the grouper_tags property.
>>> for gt in ti_g.grouper_tags:
... print(gt)
{'name': 'Cecilia', 'surname': 'Park'}
{'name': 'Eve', 'surname': 'Palmer'}
{'name': 'Moses', 'surname': 'Farrel'}
Now let’s get a list of TimeGaps, one per employee and see what it has
inside.
>>> grouped_ti = ti_g.grouped_intervals
>>> for group in grouped_ti:
... print('Group number:', grouped_ti.index(group) + 1)
... print('Total duration:', group.total_duration('unable'))
... for i, tg in enumerate(group):
... s = tg.start if tg.start!='' else 'unknown'
... e = tg.end if tg.end!='' else 'unknown'
... d = tg.duration if tg.duration!='' else 'unknown'
... emp = f'{tg.name} {tg.surname}'
... print(f'Interval {i + 1} ->')
... print(f'\t\tEmployee: {emp}')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
Group number: 1
Total duration: 2:30:38
Interval 1 ->
Employee: Cecilia Park
Start: 2019-12-19 10:00:02
End: 2019-12-19 11:00:05
Duration: 1:00:03
Interval 2 ->
Employee: Cecilia Park
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
Group number: 2
Total duration: unable
Interval 1 ->
Employee: Eve Palmer
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 2 ->
Employee: Eve Palmer
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Group number: 3
Total duration: unable
Interval 1 ->
Employee: Moses Farrel
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Employee: Moses Farrel
Start: 2019-12-19 10:00:05
End: 2019-12-19 13:45:15
Duration: 3:45:10
You can easily see that a TimeGaps iterator has been created for each
employee with the same methods and properties as their TimeGaps object’s
father. And that’s why we could call the
total_duration method
for each group in grouped_ti collection.
Total duration anyway¶
But what happens if you want to display the the duration of a group, even if it’s not perfect? Maybe you just want to dispaly it differently. Well, in those cases you can use the total_duration_anyway method.
Let’s rework the previous example adding this new functionality.
>>> grouped_ti = ti_g.grouped_intervals
>>> # Example with total_duration_anyway() method
... for group in grouped_ti:
... print('Group number:', grouped_ti.index(group) + 1)
... # If there is a perfect duration it will be printed
... if (tot_duration := group.total_duration(False)) != False:
... print('Total duration:', tot_duration)
... # Otherwise, the imperfect duration will be displayed
... else:
... print('Not perfect duration ', group.total_duration_anyway())
... for i, tg in enumerate(group):
... s = tg.start if tg.start!='' else 'unknown'
... e = tg.end if tg.end!='' else 'unknown'
... d = tg.duration if tg.duration!='' else 'unknown'
... emp = f'{tg.name} {tg.surname}'
... print(f'Interval {i + 1} ->')
... print(f'\t\tEmployee: {emp}')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
...
Group number: 1
Total duration: 2:30:38
Interval 1 ->
Employee: Cecilia Park
Start: 2019-12-19 10:00:02
End: 2019-12-19 11:00:05
Duration: 1:00:03
Interval 2 ->
Employee: Cecilia Park
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
Group number: 2
Not perfect duration 3:30:20
Interval 1 ->
Employee: Eve Palmer
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 2 ->
Employee: Eve Palmer
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Group number: 3
Not perfect duration 3:45:10
Interval 1 ->
Employee: Moses Farrel
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Employee: Moses Farrel
Start: 2019-12-19 10:00:05
End: 2019-12-19 13:45:15
Duration: 3:45:10
As you can see above, groups 1 and 3 have a perfect duration, and this is displayed
with the label ‘Duration:’. On the other hand, group number 2 has an interval
without a valid duration (unknown), so Stilpy takes the remaining valid
durations, and returns a partial duration, used by our program to display the
result, labeled as ‘Duration not perfect’.