.. |br| raw:: html
.. _timedelta: https://docs.python.org/3.8/library/datetime.html#timedelta-objects
=====
Usage
=====
To use Stilpy in a project:
>>> from stilpy import TimeGaps
Minimal example
-----------------
Suppose that you have serveral records of time stored in a list of dictionaries, like this:
:
>>> list_dict = [
... {'t_dt':'start','dt': '2019-12-19 10:00:00'},
... {'t_dt':'end', 'dt': '2019-12-19 13:30:20'},
... {'t_dt':'start', 'dt': '2019-12-19 14:30:00'},
... {'t_dt':'start', 'dt': '2019-12-19 15:30:00'},
... {'t_dt':'end', 'dt': '2019-12-19 17:00:35'},
... {'t_dt':'start', 'dt': '2019-12-19 09:00:00'}
... ]
As you can see, these are time intervals. They have a start and an end point, but
they are not in the rigth order. The first two elements are correct. But then we
have two start points together. And the last one is a start point record that
should be on the top of the list because is older than te others.
What Stilpy can do for us is to make an iterator with those records, matching the
start points with the right end, or giving them an unknown end if they don't have
one of their own.
To do that we need to make an instance of the
`TimeGaps `__ class.
``TimeGaps`` recieves several parameters. Some of them are optionals. Let's see
which ones are needed for our example:
**iterable:**
An iterable object that contains a list of items.
Those items must be dicts or dictlike objects. Lists,
tuples and objects with ``__dict__`` atribute are accepted as
well. |br| Every item, must content in itself the next items: |br|
1. A ``datetime`` object or a string format ``datetime``. In our
case, we have the second option. |br|
2. An item that defines if the first element that we just mentioned
is an initial or a final time point of a time interval.
**tag_loc:**
It tells ``TimeGaps`` where to find the tag that tells if the
item is a start point or an end.
**i_tag:**
The name of the initial time tag in the iterable. Default is ``'start'``.
**f_tag:**
The name of the final time tag in the iterable. Default is ``'end'``.
**dt_loc:**
The location, inside each element of the iterable, of the
``datetime`` information. It can be a dictionary key or an index,
depending on the collection.
The rest of parameters are optionals, and we won't need them yet.
Now we can call create the ``TimeGaps`` object, passing the arguments in
order:
>>> ti = TimeGaps(list_dict, 't_dt', 'start', 'end', 'dt')
Now we have our iterator. Every item is a ``TimeInterval`` object with
some attributes like `start `__,
`end `__,
`duration `__ and
`is_perfect `__.
You can add any other attribute to the
`TimeInterval object `__, but we will see
that later. Now we are just going to print each element. But first, we will ask
for the sum of the durations of all its intervals. At the same time, we'll pass
an argument that will be returned if ``TimeGaps`` is unable to make the sum
because some interval hasn't a duration. By default ``None`` will be returned.
>>> ti.total_duration('Sorry! Some interval is not perfect')
'Sorry! Some interval is not perfect'
>>> for i, t in enumerate(ti):
... s = t.start if t.start!='' else 'unknown'
... e = t.end if t.end!='' else 'unknown'
... d = t.duration if t.duration!='' else 'unknown'
... print(f'Interval {i + 1} ->')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
Interval 1 ->
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 3 ->
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Interval 4 ->
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
You can see that, by default, an empty property is set to ``''``.
If we pass a perfect time intervals collection
`total_duration `__
will return a very different result. Let's see an example.
>>> perfect_list_dict = [
... {'t_dt':'start','dt': '2019-12-19 10:00:00'},
... {'t_dt':'end', 'dt': '2019-12-19 13:30:20'},
... {'t_dt':'start', 'dt': '2019-12-19 14:30:00'},
... {'t_dt':'end', 'dt': '2019-12-19 17:00:35'},
... ]
>>> ti_p = TimeGaps(perfect_list_dict, 't_dt', 'start', 'end', 'dt')
>>> for t in ti_p:
... print(t.duration)
3:30:20
2:30:35
>>> ti_p.total_duration()
datetime.timedelta(seconds=21655)
>>> print(ti_p.total_duration())
6:00:55
As you can see, this method returns a timedelta_ object with the sum
of the duration of every ``TimeInterval``.
Time intervals with groups
--------------------------
In the previous example, we just got the records that need to be ordered
and put together. But what happens if we have records that belong to
different groups, all together in the same collection? Well, for that we
have the ``group_by`` parameter.
Let's try another example.
Imagine we're working with the sign-in and sign-out of the employees from the
company's web application. We should have something like this:
>>> keys_dicts = [
... {
... 'name': 'Eve', 'surname': 'Palmer',
... 't_dt':'start', 'dt': '2019-12-19 10:00:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'end', 'dt': '2019-12-19 11:00:05'
... },
... {
... 'name': 'Moses', 'surname': 'Farrel',
... 't_dt':'start', 'dt': '2019-12-19 10:00:05'
... },
... {
... 'name': 'Eve', 'surname': 'Palmer',
... 't_dt':'end', 'dt': '2019-12-19 13:30:20'
... },
... {
... 'name': 'Moses', 'surname': 'Farrel',
... 't_dt':'end', 'dt': '2019-12-19 13:45:15'
... },
... {
... 'name': 'Eve', 'surname': 'Palmer',
... 't_dt':'start', 'dt': '2019-12-19 14:30:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'start', 'dt': '2019-12-19 15:30:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'end', 'dt': '2019-12-19 17:00:35'
... },
... {
... 'name': 'Moses', 'surname': 'Farrel',
... 't_dt':'start', 'dt': '2019-12-19 09:00:00'
... },
... {
... 'name': 'Cecilia', 'surname': 'Park',
... 't_dt':'start', 'dt': '2019-12-19 10:00:02'
... },
... ]
We cannot order these records based only on their temporary value.
If we do that, we'll be ignoring that every record belongs to a different person.
So we have to use the ``group_by`` parameter by saying which keys should use
`TimeGaps `__ to order this records. Let's see how:
For our example we need to group the records by name and surname. ``group_by``
is a keyword argumen and it's expecting a single element or a collection,
preferred a tuple. So we do it like this:
>>> ti_g = TimeGaps(
... keys_dicts, 't_dt', 'start', 'end', 'dt',
... group_by=('name', 'surname')
... )
But, additionally maybe we want to store that pairs of keys and values
of names and surnames inside of te ``TimeInterval`` objects, in order
to differentiate some intervals from others. As we said before ``group_by``
is a keyword argumen. Any other positional argumen used to instanciate the
`TimeGaps `__ class different of
``iterable``, ``tag_loc``, ``i_tag``, ``f_tag``
and ``dt_loc`` will be treated as the key for creating the additional
attributes for the ``TimeInterval`` objects of a ``TimeGaps`` iterator (this
option is not aviable if your are working with an iterable of any collection
that works with index instead of keys, like list, tuples... So if you have a
list of list or a list of tuple, your can use ``group_by`` but you can't add
additionals attributes to the ``TimeInterval`` objects). So we can change the
instanciation like this:
>>> ti_g = TimeGaps(
... keys_dicts, 't_dt', 'start', 'end', 'dt',
... 'name', 'surname',
... group_by=('name', 'surname')
... )
Now if we print every element we should see how the ``TimeInterval`` objects
has been created by groups, and how they are ordered in the collection.
>>> for i, tg in enumerate(ti_g):
... s = tg.start if tg.start!='' else 'unknown'
... e = tg.end if tg.end!='' else 'unknown'
... d = tg.duration if tg.duration!='' else 'unknown'
... emp = f'{tg.name} {tg.surname}'
... print(f'Interval {i + 1} ->')
... print(f'\t\tEmployee: {emp}')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
Interval 1 ->
Employee: Moses Farrel
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Employee: Eve Palmer
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 3 ->
Employee: Cecilia Park
Start: 2019-12-19 10:00:02
End: 2019-12-19 11:00:05
Duration: 1:00:03
Interval 4 ->
Employee: Moses Farrel
Start: 2019-12-19 10:00:05
End: 2019-12-19 13:45:15
Duration: 3:45:10
Interval 5 ->
Employee: Eve Palmer
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Interval 6 ->
Employee: Cecilia Park
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
If we have two records that will be conform an interval add an extra
argument this will be the expected behaviour:
* if the key is present in both records with the same value, that value
will be used to populate the new attribute
* if the key is present in both records with different values, the
attribute will be populated with a tuple whose first element is the
value the start record's value for that key, and the second will be
the end record's value for that key
* if the key is present just in one of the records, its value will be
used to populate the attribute
But what happens if we want different iterators, one per element of the group?
Let’s say that we want a iterator for every employee. You can easily have it. In
fact you will get a list of ``TimeGaps`` objects, one for each employee. You
just need to call the
`grouped_intervals `__ property.
First let's see the groups that we have, by calling the
`grouper_tags `__ property.
>>> for gt in ti_g.grouper_tags:
... print(gt)
{'name': 'Cecilia', 'surname': 'Park'}
{'name': 'Eve', 'surname': 'Palmer'}
{'name': 'Moses', 'surname': 'Farrel'}
Now let's get a list of ``TimeGaps``, one per employee and see what it has
inside.
>>> grouped_ti = ti_g.grouped_intervals
>>> for group in grouped_ti:
... print('Group number:', grouped_ti.index(group) + 1)
... print('Total duration:', group.total_duration('unable'))
... for i, tg in enumerate(group):
... s = tg.start if tg.start!='' else 'unknown'
... e = tg.end if tg.end!='' else 'unknown'
... d = tg.duration if tg.duration!='' else 'unknown'
... emp = f'{tg.name} {tg.surname}'
... print(f'Interval {i + 1} ->')
... print(f'\t\tEmployee: {emp}')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
Group number: 1
Total duration: 2:30:38
Interval 1 ->
Employee: Cecilia Park
Start: 2019-12-19 10:00:02
End: 2019-12-19 11:00:05
Duration: 1:00:03
Interval 2 ->
Employee: Cecilia Park
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
Group number: 2
Total duration: unable
Interval 1 ->
Employee: Eve Palmer
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 2 ->
Employee: Eve Palmer
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Group number: 3
Total duration: unable
Interval 1 ->
Employee: Moses Farrel
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Employee: Moses Farrel
Start: 2019-12-19 10:00:05
End: 2019-12-19 13:45:15
Duration: 3:45:10
You can easily see that a ``TimeGaps`` iterator has been created for each
employee with the same methods and properties as their ``TimeGaps`` object's
father. And that's why we could call the
`total_duration `__ method
for each ``group`` in ``grouped_ti`` collection.
Total duration anyway
---------------------
But what happens if you want to display the the duration of a group, even if it's
not perfect? Maybe you just want to dispaly it differently. Well, in those
cases you can use the
`total_duration_anyway `__
method.
Let's rework the previous example adding this new functionality.
>>> grouped_ti = ti_g.grouped_intervals
>>> # Example with total_duration_anyway() method
... for group in grouped_ti:
... print('Group number:', grouped_ti.index(group) + 1)
... # If there is a perfect duration it will be printed
... if (tot_duration := group.total_duration(False)) != False:
... print('Total duration:', tot_duration)
... # Otherwise, the imperfect duration will be displayed
... else:
... print('Not perfect duration ', group.total_duration_anyway())
... for i, tg in enumerate(group):
... s = tg.start if tg.start!='' else 'unknown'
... e = tg.end if tg.end!='' else 'unknown'
... d = tg.duration if tg.duration!='' else 'unknown'
... emp = f'{tg.name} {tg.surname}'
... print(f'Interval {i + 1} ->')
... print(f'\t\tEmployee: {emp}')
... print(f'\t\tStart: {s}')
... print(f'\t\tEnd: {e}')
... print(f'\t\tDuration: {d}')
...
Group number: 1
Total duration: 2:30:38
Interval 1 ->
Employee: Cecilia Park
Start: 2019-12-19 10:00:02
End: 2019-12-19 11:00:05
Duration: 1:00:03
Interval 2 ->
Employee: Cecilia Park
Start: 2019-12-19 15:30:00
End: 2019-12-19 17:00:35
Duration: 1:30:35
Group number: 2
Not perfect duration 3:30:20
Interval 1 ->
Employee: Eve Palmer
Start: 2019-12-19 10:00:00
End: 2019-12-19 13:30:20
Duration: 3:30:20
Interval 2 ->
Employee: Eve Palmer
Start: 2019-12-19 14:30:00
End: unknown
Duration: unknown
Group number: 3
Not perfect duration 3:45:10
Interval 1 ->
Employee: Moses Farrel
Start: 2019-12-19 09:00:00
End: unknown
Duration: unknown
Interval 2 ->
Employee: Moses Farrel
Start: 2019-12-19 10:00:05
End: 2019-12-19 13:45:15
Duration: 3:45:10
As you can see above, groups 1 and 3 have a perfect duration, and this is displayed
with the label 'Duration:'. On the other hand, group number 2 has an interval
without a valid duration (``unknown``), so Stilpy takes the remaining valid
durations, and returns a partial duration, used by our program to display the
result, labeled as 'Duration not perfect'.