trumania.core package¶
Submodules¶
trumania.core.attribute module¶
- 
class trumania.core.attribute.Attribute(population, init_values=None, init_gen=None, init_relationship=None)[source]¶
- Bases: - object- Static population attribute, with various ways to initialize it randomly - 
class AttributeOps(attribute)[source]¶
- Bases: - object- 
update(member_id_field, copy_from_field)[source]¶
- Overwrites the value of this attribute with the values found in the specified field. - Parameters: - member_id_field – name of the field of the story data containing the ids of the members whose attribute should be updated
- copy_from_field – name of the field of the story data containing the new values of the attribute
 
- 
 - 
add(ids, added_values)[source]¶
- This only makes sense for attributes that support a + operation (e.g. numerical values or lists): it simply performs a += operation. 
 
- 
trumania.core.circus module¶
- 
class trumania.core.circus.Circus(name, master_seed, **clock_params)[source]¶
- Bases: - object- A Circus is a container for the objects required to run the simulation. It is also the object that executes the stories required for one iteration. - 
attach_generator(gen_id, generator)[source]¶
- Attaches a random generator to this circus, so that it gets persisted with the rest. 
 - 
create_population(name, **population_params)[source]¶
- Creates a population with the specified parameters and attaches it to this circus. 
 - 
create_story(name, **story_params)[source]¶
- Creates a story with the provided parameters and attaches it to this circus. 
 - 
get_story(story_name)[source]¶
- Looks up a story by name in this circus and returns it. Returns None if not found. 
 - 
load_generator(gen_type, gen_id)[source]¶
- Loads this generator definition and attaches it to this circus. 
 - 
load_population(population_id, namespace=None)[source]¶
- Loads this population definition and attaches it to this circus. 
 - 
run(duration, log_output_folder, delete_existing_logs=False)[source]¶
- Executes all stories in the circus for as long as requested. - Parameters: - duration (pd.Timedelta) – duration of the desired simulation (the start date is dictated by the clock)
- log_output_folder (string) – folder where to write the logs
- delete_existing_logs –
 
 - 
static save_logs(log_id, logs, log_output_folder)[source]¶
- Appends those logs to the corresponding output file, creating it if it does not exist or appending lines to it otherwise. 
 
- 
trumania.core.clock module¶
- 
class trumania.core.clock.Clock(start, step_duration, seed)[source]¶
- Bases: - object- A Clock is the central object managing the evolution of time of the whole circus. It generates timestamps on demand and provides information for TimeProfiler objects. - 
class ClockOps(clock)[source]¶
- Bases: - object
 - 
get_timestamp(size=1, random=True, log_format=None)[source]¶
- Returns timestamps formatted as strings. - Parameters: - size (int) – number of timestamps to generate, default 1
- random (boolean) – if True, the timestamps are randomly generated in [self.current_date, self.current_date+self.step_duration]
- log_format (string) – string format of the generated timestamps
 - Return type: - Pandas Series - Returns: - random timestamps in the form of strings 
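The random sampling described for get_timestamp can be illustrated with a small stand-alone sketch. It is not the library's implementation: it uses the standard library instead of pandas, returns a plain list instead of a Pandas Series, and the function name `sample_timestamps`, the default `log_format` and the behaviour when `randomize` is False (returning the current date) are all assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta


def sample_timestamps(current_date, step_duration, size=1, randomize=True,
                      log_format="%Y-%m-%d %H:%M:%S", seed=None):
    """Hypothetical helper: timestamps in [current_date, current_date + step_duration]."""
    rng = random.Random(seed)
    step_seconds = step_duration.total_seconds()
    stamps = []
    for _ in range(size):
        # Assumption: without randomization we simply return the current date.
        offset = rng.uniform(0, step_seconds) if randomize else 0.0
        stamps.append((current_date + timedelta(seconds=offset)).strftime(log_format))
    return stamps


start = datetime(2024, 1, 1, 8, 0, 0)
step = timedelta(minutes=15)
print(sample_timestamps(start, step, size=3, seed=1))
```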
 
- 
- 
class trumania.core.clock.CyclicTimerGenerator(clock, seed, config)[source]¶
- Bases: - trumania.core.random_generators.DependentGenerator- A TimeProfiler contains an activity profile over a defined time range. It is mostly a superclass: normally only its child classes should be used. - The goal of a TimeProfiler is to keep track of the expected level of activity of users over a cyclic time range. It stores a vector with probabilities of activity per time step, as well as a cumulative sum of the probabilities starting with the current time step. - This allows it to quickly produce random waiting times until the next event for the users. - 
activity(n, per)[source]¶
- Parameters: - n – number of stories
- per (pd.Timedelta) – time period for that number of stories
 - Returns: - the activity level corresponding to the specified number n of executions per time period 
 
- 
- 
class trumania.core.clock.CyclicTimerProfile(profile, profile_time_steps, start_date)[source]¶
- Bases: - object- Static parameters of the Timer profile. Separated from the timer gen itself to facilitate persistence. - Parameters: - profile (python array) – weight of each period
- profile_time_steps (string) – duration of the time steps in the profile (e.g. “15min”)
- start_date (pd.Timestamp) – date of the origin of the specified profile; this is used to align with the values of the clock 
trumania.core.operations module¶
- 
class trumania.core.operations.AddColumns(join_kind='left')[source]¶
- Bases: - trumania.core.operations.Operation- Very typical case of an operation that appends (i.e. joins) columns to the previous result 
- 
class trumania.core.operations.Apply(source_fields, named_as, f, f_args='dataframe')[source]¶
- Bases: - trumania.core.operations.AddColumns- Custom operation adding one single column computed from a user-provided function. - The length of source_fields must match the number of columns in the dataframe expected by the user-provided function f. 
- 
class trumania.core.operations.Chain(*operations)[source]¶
- Bases: - trumania.core.operations.Operation- A chain is a list of operations to be executed sequentially. 
- 
class trumania.core.operations.DropRow(condition_field)[source]¶
- Bases: - trumania.core.operations.Operation- Discards any row in the story data where the condition field is false. 
- 
class trumania.core.operations.FieldLogger(log_id, cols=None, exploded_cols=None)[source]¶
- Bases: - trumania.core.operations.Operation- Log creator that simply selects a set of columns and creates a logged dataframe from them. 
- 
class trumania.core.operations.Operation[source]¶
- Bases: - object- An Operation is able to transform an input into an output and to produce logs.
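The relation between Operation and Chain can be sketched without the library. The sketch below is only an illustration under stated assumptions: an operation is modelled as a plain callable taking the data and returning an (output, logs) pair, and the names `chain`, `add_total` and `log_totals` are hypothetical. The real classes work on pandas dataframes and their exact method names are not shown in this page.

```python
def chain(*operations):
    """Hypothetical stand-in for Chain: runs operations one after the other."""
    def run(data):
        all_logs = {}
        for operation in operations:
            # Each operation transforms the data and may emit logs.
            data, logs = operation(data)
            all_logs.update(logs)
        return data, all_logs
    return run


def add_total(rows):
    # Adds one computed column, emits no logs.
    return [dict(row, total=row["price"] * row["qty"]) for row in rows], {}


def log_totals(rows):
    # Leaves the data untouched, emits one log entry.
    return rows, {"totals": [row["total"] for row in rows]}


pipeline = chain(add_total, log_totals)
data, logs = pipeline([{"price": 2, "qty": 3}])
```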
 
- 
class trumania.core.operations.SideEffectOnly[source]¶
- Bases: - trumania.core.operations.Operation- Operation that produces neither logs nor supplementary columns: it only has side effects. 
- 
trumania.core.operations.bound_value(lb=None, ub=None)[source]¶
- builds a function that limits the range of a value 
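A function builder of this kind is typically a closure that clamps its argument. The following is a minimal sketch on scalar values, assuming that a bound left to None means "no limit on that side"; the library's own version may operate on different types.

```python
def bound_value(lb=None, ub=None):
    """Sketch: returns a function clamping a scalar into [lb, ub]."""
    def clamp(x):
        if lb is not None and x < lb:
            return lb
        if ub is not None and x > ub:
            return ub
        return x
    return clamp


keep_positive = bound_value(lb=0)
percentage = bound_value(lb=0, ub=100)
```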
- 
trumania.core.operations.bounded_sigmoid(x_min, x_max, shape, incrementing=True)[source]¶
- Builds an S-shaped curve whose y values evolve between 0 and 1 over the x domain [x_min, x_max]. - This is preferable to the logistic function for cases where we want to make sure that the curve actually reaches 0 and 1 at some point (e.g. the probability of triggering a “restock” story must be 1 if the stock is as low as 1). - See /tests/notebooks/bounded_sigmoid.ipynb for examples - Parameters: - x_min – lower bound of the x domain
- x_max – upper bound of the x domain
- incrementing – if True, evolve from 0 to 1, or from 1 to 0 otherwise
- shape – strictly positive number controlling the shape of the resulting function: 1 corresponds to a linear transition; higher values yield a sharper and sharper, i.e. more vertical, S shape, converging towards a step function transitioning at the midpoint of the x domain for very large values of shape (e.g. 10000); values in ]0,1[ yield vertically shaped sigmoids, sharply rising/falling at the boundaries of the x domain and transitioning more smoothly in the middle of it.
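To make the role of the shape parameter concrete, here is a stand-in curve with the same qualitative properties. It is explicitly not the library's formula: it uses the ratio t^s / (t^s + (1-t)^s) on the normalized position t, chosen only because it is exactly 0 and 1 at the bounds, linear for a shape of 1, step-like for large shapes and steep near the boundaries for shapes below 1. The name `bounded_s_curve` is hypothetical.

```python
import math


def bounded_s_curve(x_min, x_max, shape, incrementing=True):
    """Stand-in S-curve on [x_min, x_max]; not the library's implementation."""
    if shape <= 0:
        raise ValueError("shape must be strictly positive")

    def f(x):
        # Values outside the domain repeat the value at the nearest boundary.
        x_b = min(max(x, x_min), x_max)
        t = (x_b - x_min) / (x_max - x_min)
        if not incrementing:
            t = 1.0 - t
        if t <= 0.0:
            return 0.0
        if t >= 1.0:
            return 1.0
        # Work in log space so that very large shapes do not overflow.
        a = shape * math.log(t)
        b = shape * math.log(1.0 - t)
        m = max(a, b)
        rising, falling = math.exp(a - m), math.exp(b - m)
        return rising / (rising + falling)

    return f


restock_proba = bounded_s_curve(x_min=1, x_max=20, shape=5, incrementing=False)
```

With `incrementing=False` the curve returns 1 at `x_min`, which matches the restock use case mentioned above.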
 
 
- 
trumania.core.operations.copy_if(story_data)[source]¶
- Copies values from the source to the “named_as” field if the condition is True, otherwise inserts NA - usage: - Apply(source_fields=["some_source_field", "some_condition_field"],
- named_as="some_result_field", f=copy_if)
 
- 
trumania.core.operations.logistic(k, x0=0, L=1)[source]¶
- Returns a function, usable in an Apply operation, that transforms the specified field with a sigmoid with the provided parameters - Parameters: - k – the steepness of the curve
- x0 – the x-value of the sigmoid’s midpoint (default: 0)
- L – maximum value of the logistic (default: 1)
 - same parameter naming conventions as in: https://en.wikipedia.org/wiki/Logistic_function - usage:
- Apply(source_fields=["some_source_field"],
- named_as="some_result_field", f=logistic(k=-0.01, x0=1000))
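With the parameter names above, the logistic function is f(x) = L / (1 + exp(-k (x - x0))). A scalar sketch of the function builder follows; the library's version is meant to be applied to dataframe columns inside an Apply operation, so this only illustrates the formula.

```python
import math


def logistic(k, x0=0, L=1):
    """Sketch: returns f(x) = L / (1 + exp(-k * (x - x0))) for scalar x."""
    def f(x):
        # Note: math.exp overflows for exponents above roughly 709.
        return L / (1 + math.exp(-k * (x - x0)))
    return f


# A negative k gives a decreasing curve, here centred on x0 = 1000.
decreasing = logistic(k=-0.01, x0=1000)
```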
 
 
trumania.core.population module¶
- 
class trumania.core.population.Population(circus, ids_gen=None, size=None, ids=None)[source]¶
- Bases: - object- 
class PopulationOps(population)[source]¶
- Bases: - object- 
lookup(id_field, select)[source]¶
- Looks up some attribute values by joining on the specified field of the current data - Parameters: - id_field – field name in the story_data. If that column contains lists, then it is assumed to contain only lists and it is flattened to obtain the list of ids to look up in the attribute. Must be a list of “scalar” values or a list of lists, not a mix of both.
- select – dictionary of (attribute_name -> given_name)
 - specifying which attribute to look up and which name to give to the resulting column 
 - 
select_one(named_as)[source]¶
- Appends a field column to the story_data containing member ids taken at random among the ids of this population. - This is similar to relationship_select_one(), except that no particular relation is required, we just sample one id randomly - Parameters: - named_as – the name of the field added to the story_data 
 - 
update(id_field, copy_attributes_from_fields)[source]¶
- Adds or updates members and their attributes. - Note that the index of story_data, i.e. the ids of the _triggering_ members, is irrelevant during this operation. - Parameters: - id_field – ids of the updated or created members
- copy_attributes_from_fields – dictionary of (attribute name -> story data field name) that describes which column in the population dataframe to use to update which attribute.
 
 - Returns: 
 
- 
 - 
create_relationship(name, seed=None)[source]¶
- creates an empty relationship from the members of this population 
 - 
create_stock_relationship(name, item_id_gen, n_items_per_member)[source]¶
- Creates a relationship aimed at maintaining a stock, from a generator that creates stock item ids. - The relationship does not point to another population, but to items whose ids are generated with the provided generator. 
 - 
create_stock_relationship_grp(name, stock_bulk_gen)[source]¶
- This creates exactly the same kind of relationship as create_stock_relationship, but using a generator of lists of stock items instead of a generator of items. 
 - 
get_attribute_values(attribute_name, ids=None)[source]¶
- Returns: - the values of this attribute, as a Series 
 - 
static load_from(folder, circus)[source]¶
- Reads all persistent data of this population and loads it - Parameters: - folder – folder containing all CSV files of this population
- circus – parent circus containing this population
 - Returns: 
 - 
save_to(target_folder)[source]¶
- Saves this population and all its attributes and relationships to the specified folder. - If the folder already exists, it is deleted first. 
 - 
update(attribute_df)[source]¶
- Adds or updates members with the provided attribute ids and values - Parameters: - attribute_df – must be a dataframe whose index contains the ids of the inserted members. There must be as many columns as there are attributes currently defined in this population. - If the members for the specified ids already exist, their values are updated, otherwise the members are created. 
 
- 
trumania.core.random_generators module¶
- 
class trumania.core.random_generators.ConstantDependentGenerator(value)[source]¶
- Bases: - trumania.core.random_generators.ConstantGenerator,- trumania.core.random_generators.DependentGenerator- Dependent generator ignoring the observations and producing a constant value. 
- 
class trumania.core.random_generators.DependentBulkGenerator(element_generator)[source]¶
- Bases: - trumania.core.random_generators.DependentGenerator- Dependent Generator that transforms the observations into lists of observation elements that are generated through element_generator. 
- 
class trumania.core.random_generators.DependentGenerator[source]¶
- Bases: - object- Generator providing random values depending on some live observation among the fields of the story or attributes of the populations. - This opens the door to “probability given” distributions - 
class DependentGeneratorOps(generator)[source]¶
- Bases: - object- 
class RandomValuesFromField(generator, named_as, observations_field)[source]¶
- Bases: - trumania.core.operations.AddColumns- Operation that produces one single column generated randomly. 
 
- 
 
- 
- 
class trumania.core.random_generators.DependentTrigger(value_to_proba_mapper=<function identity>, seed=None)[source]¶
- Bases: - object- A trigger is a boolean Generator. - A dependent trigger transforms, with the specified function, the value of the depended on story field or population attribute into the [0,1] range and uses that as the probability of triggering (i.e. of returning True) 
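The triggering logic described here can be sketched in a few lines: map each observed value to a probability, then draw a boolean with that probability. This is an illustrative stand-in using the standard library, not the library's class; `TriggerSketch` and the example mapper are hypothetical names.

```python
import random


class TriggerSketch:
    """Hypothetical stand-in: boolean generator driven by observed values."""

    def __init__(self, value_to_proba_mapper=lambda value: value, seed=None):
        self.value_to_proba_mapper = value_to_proba_mapper
        self.rng = random.Random(seed)

    def generate(self, observations):
        # rng.random() lies in [0, 1), so a probability of 1 always triggers
        # and a probability of 0 never does.
        return [self.rng.random() < self.value_to_proba_mapper(obs)
                for obs in observations]


# Example mapper: certain trigger when the stock is at most 1, never otherwise.
low_stock = TriggerSketch(lambda stock: 1.0 if stock <= 1 else 0.0, seed=7)
print(low_stock.generate([0, 1, 5, 12]))
```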
- 
class trumania.core.random_generators.DependentTriggerGenerator(value_to_proba_mapper=<function identity>, seed=None)[source]¶
- Bases: - trumania.core.random_generators.DependentTrigger,- trumania.core.random_generators.DependentGenerator- Composition of the two mixins above:
- DependentGenerator: with the ability to build operations that generate random values
- DependentTrigger: to specify that the generation actually produces booleans with a value_mapper 
 
- 
class trumania.core.random_generators.FakerGenerator(seed, method, **fakerKwargs)[source]¶
- Bases: - trumania.core.random_generators.Generator- Generator wrapping Faker factory 
- 
class trumania.core.random_generators.Generator[source]¶
- Bases: - object- Independent parameterized random value generator. Abstract class - 
class GeneratorOps(generator)[source]¶
- Bases: - object- 
class RandomValues(generator, named_as, quantity_field)[source]¶
- Bases: - trumania.core.operations.AddColumns- Operation that produces one single column generated randomly. 
 
- 
 - 
file_loaders= {'NumpyRandomGenerator': <function NumpyRandomGenerator.load_from at 0x1148b6b70>, 'SequencialGenerator': <function SequencialGenerator.load_from at 0x1148b6f28>}¶
 - 
flatmap(dependent_generator)[source]¶
- Not _really_ a flatmap but close enough on concept (I guess): this chains self with a DependentGenerator by feeding our output values as observations to the DependentGenerator at the moment of generation. - Parameters: - dependent_generator – must be an instance of DependentGenerator, i.e. have a .generate(observations=…) method - Returns: - an instance of Generator whose .generate(size=…) method provides the combination of the above 
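The chaining performed by flatmap can be pictured with two tiny stand-in classes. Only the `.generate(size=…)` and `.generate(observations=…)` method shapes come from the description above; the class names and the example generators are hypothetical and the library's classes carry more state (seeds, persistence).

```python
import random


class SketchGenerator:
    """Hypothetical independent generator wrapping a function of `size`."""

    def __init__(self, generate_fn):
        self._generate_fn = generate_fn

    def generate(self, size):
        return self._generate_fn(size)

    def flatmap(self, dependent_generator):
        def combined(size):
            # Our own output becomes the observations of the dependent generator.
            observations = self.generate(size)
            return dependent_generator.generate(observations=observations)
        return SketchGenerator(combined)


class SketchBulkGenerator:
    """Hypothetical dependent generator: one list of n items per observation n."""

    def generate(self, observations):
        return [["item"] * n for n in observations]


rng = random.Random(42)
basket_sizes = SketchGenerator(lambda size: [rng.randint(1, 3) for _ in range(size)])
baskets = basket_sizes.flatmap(SketchBulkGenerator())
print(baskets.generate(size=4))
```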
 - 
generate(size)[source]¶
- “Independent” random value generation: it does not depend on any previous observation, we just want to sample the random variable size times - Parameters: - size – the number of random values to produce - Returns: - an array of generated random values 
 
- 
- 
class trumania.core.random_generators.MSISDNGenerator(countrycode, prefix_list, length, seed=None)[source]¶
- 
class trumania.core.random_generators.MongoIdGenerator[source]¶
- Bases: - trumania.core.random_generators.Generator- Generates a random ObjectId for MongoDB, from bson.objectid.ObjectId; see http://api.mongodb.com/python/current/api/bson/objectid.html 
- 
class trumania.core.random_generators.NumpyRandomGenerator(method, seed, **numpy_parameters)[source]¶
- Bases: - trumania.core.random_generators.Generator- Generator wrapping any numpy.Random method. 
- 
class trumania.core.random_generators.ParetoGenerator(xmin, seed=None, force_int=False, **np_params)[source]¶
- Bases: - trumania.core.random_generators.Generator- Builds a Pareto distribution having xmin as lower bound for the sampled values and a as power parameter, i.e.:
- p(x|a) = (a/xmin) * (x/xmin)^-(a+1) if x >= xmin
- p(x|a) = 0 otherwise
 - The higher the value of a, the closer the Pareto gets to Dirac’s delta. 
- force_int rounds each value to an integer (handy to generate counts distributed as a power law)
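A Pareto sample with a lower bound xmin can be obtained by scaling a standard Pareto variate (which is always at least 1) by xmin. The sketch below does this with the standard library's `random.paretovariate`; it is not the library's code, and passing the power parameter as an explicit `a` argument as well as the name `pareto_sample` are assumptions.

```python
import random


def pareto_sample(xmin, a, size, seed=None, force_int=False):
    """Sketch: `size` Pareto-distributed values, all >= xmin."""
    rng = random.Random(seed)
    # paretovariate(a) returns values >= 1, so scaling moves the bound to xmin.
    values = [xmin * rng.paretovariate(a) for _ in range(size)]
    if force_int:
        values = [int(round(v)) for v in values]
    return values


counts = pareto_sample(xmin=3, a=2.0, size=5, seed=11, force_int=True)
print(counts)
```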
 
- 
class trumania.core.random_generators.SequencialGenerator(start=0, prefix='id_', max_length=10)[source]¶
- Bases: - trumania.core.random_generators.Generator- Generator of sequential unique values 
trumania.core.relationship module¶
- 
class trumania.core.relationship.Relations(to_ids, weights)[source]¶
- Bases: - object- This entity contains all the “to” sides of the relationships of a given “from”, together with the related weights. - This data structure seems to be the most efficient one since it corresponds to a cached group-by result, and those group-bys are expensive in the select_one operation - 
static from_tuples(from_ids, to_ids, weights)[source]¶
- from_ids, to_ids and weights must be 3 arrays of identical size; a relationship is built here for each “line” read across those 3 arrays. - This method builds one instance of Relations for each unique from_id value, containing all the to_ids it is related to. 
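The grouping described for from_tuples amounts to a group-by on the "from" side. A plain-Python sketch, using dictionaries in place of Relations instances (the name `group_relations` and the dictionary layout are assumptions for illustration):

```python
from collections import defaultdict


def group_relations(from_ids, to_ids, weights):
    """Sketch: one {"to_ids": [...], "weights": [...]} entry per unique from_id."""
    if not (len(from_ids) == len(to_ids) == len(weights)):
        raise ValueError("from_ids, to_ids and weights must have the same size")
    grouped = defaultdict(lambda: {"to_ids": [], "weights": []})
    # Each "line" read across the three arrays is one relation.
    for from_id, to_id, weight in zip(from_ids, to_ids, weights):
        grouped[from_id]["to_ids"].append(to_id)
        grouped[from_id]["weights"].append(weight)
    return dict(grouped)


relations = group_relations(["a", "a", "b"], ["x", "y", "x"], [1, 2, 1])
```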
 - 
minus(other)[source]¶
- removes from self _all_ relations to the to_ids mentioned in the provided other Relation 
 - 
pick_many(random_state, amount)[source]¶
- Quantities and req_indices should have the same size: the first one lists the ids of the index that requests some selections to be picked in the relationship, and the second provides the quantities that each request asked for. - The result will be in vertical format, with as many lines as the sum of the quantities. 
 
- 
- 
class trumania.core.relationship.Relationship(seed)[source]¶
- Bases: - object- 
class RelationshipOps(relationship)[source]¶
- Bases: - object- 
class SelectOne(relationship, from_field, named_as, one_to_one, pop, discard_missing, weight)[source]¶
 - 
add_grouped(from_field, grouped_items_field)[source]¶
- this is similar to add, except that the “to” field should here contain lists of “to” values instead of single ones 
 - 
select_all(from_field, named_as)[source]¶
- This simply creates a new story_data field containing all the “to” values of the requested from, as a set. 
 - 
select_one(from_field, named_as, one_to_one=False, pop=False, discard_empty=False, weight=None)[source]¶
- Parameters: - from_field – field corresponding to the “from” side of the relationship
- named_as – field name assigned to the selected “to” side of the relationship
- one_to_one – boolean indicating that any “to” value will be selected at most once
- pop – if True, the selected relation is removed
- discard_empty – if False, any non-existing “from” in the relationship yields a None in the resulting selection. If true, that row is removed from the story_data.
- weight – weight to use for the “to” side of the relationship. Must be a Series whose index are the “to” values. Typical usage would be to plug an attribute of the “to” population here.
 - Returns: - this operation adds a single column corresponding to a random choice from a Relationship 
 
- 
 - 
add_grouped_relations(from_ids, grouped_ids)[source]¶
- Add “bulk” relationship, i.e. many “to” sides for each “from” side at once. - Parameters: - from_ids – list of “from” sides of the relationships to add
- grouped_ids – list of list of “to” sides of the relationships to add
 - Note: we assume all weights are 1 for this use (for now). 
 - 
add_relations(from_ids, to_ids, weights=1)[source]¶
- Adds relations to this Relationship from from_ids, to_ids and weights. 
 - 
get_neighbourhood_size(from_ids)[source]¶
- return a series indexed by “from” containing the number of “tos” for each requested from. 
 - 
get_relations(from_ids=None)[source]¶
- This returns, as a dataframe, the sub-set of the relationships whose “from” is part of specified “from_ids”. - If no from_ids is provided, this just returns all the relations. 
 - 
remove_relations(from_ids, to_ids)[source]¶
- Removes all relations between those from_ids and to_ids pairs (not combinatory: if each list has 10 elements, we remove 10 pairs). If the same relation was stored several times between two ids, this removes them all. 
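The pairwise (non-combinatory) semantics can be illustrated on a plain list of (from, to, weight) tuples. This is a sketch of the behaviour described above, not the library's dataframe-based code; the function name `remove_pairs` is hypothetical.

```python
def remove_pairs(relations, from_ids, to_ids):
    """Sketch: drops every relation whose (from, to) is one of the zipped pairs."""
    # zip pairs the two lists position by position: 10 + 10 ids give 10 pairs.
    pairs = set(zip(from_ids, to_ids))
    return [rel for rel in relations if (rel[0], rel[1]) not in pairs]


stored = [("a", "x", 1), ("a", "x", 3), ("a", "y", 1), ("b", "x", 1)]
remaining = remove_pairs(stored, from_ids=["a", "b"], to_ids=["x", "y"])
```

Here both copies of ("a", "x") are removed, while ("a", "y") and ("b", "x") are kept because they are not among the zipped pairs.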
 - 
save_to(file_path)[source]¶
- Saves all the relationships as well as the current status of the seed as a CSV file 
 - 
select_all_horizontal(from_ids, named_as='to')[source]¶
- Returns all the “to” sides starting from each “from”, as a “horizontal” list, i.e. each “from” is on one row and the set of all “to” values is on that row, in one list. - Any requested from_id that has no relationship is absent from the returned dataframe (=> the corresponding rows are dropped in the result) 
 - 
select_many(from_ids, named_as, quantities, remove_selected=False, discard_empty=True)[source]¶
- The result is returned in vertical format and indexed by the values of the index of from_ids. Since we select several values, we return several lines per index value of from_ids => during the subsequent join by the Operation, the number of produced rows increases. 
 - 
select_one(from_ids=None, named_as='to', remove_selected=False, discard_empty=True, one_to_one=False, overridden_to_weights=None)[source]¶
- Randomly selects one “to” part for each specified id in from_ids. An id can be specified several times in that list, in which case we simply do a selection several times. The result is aligned with from_ids by index, i.e. the row in the return value that has the same pandas index as a row in from_ids is the selection for that row. - The selection in the resulting dataframe will by default be named “to”, unless this is overridden by “named_as”. - If remove_selected is True, the selected relations are removed from the relationship. This is handy to model stocks or any container of things. - If discard_empty is False, all specified from_ids will be present in the result, even if no relation is available for them or if some selections were dropped due to the one-to-one config. - If one_to_one is True, the selection is an injective function, i.e. each to_id will be picked at most once. - overridden_to_weights is an optional dictionary of {“to”: weight} that can be used to override the default weights contained in this Relationship. 
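The core of this selection, a weighted random choice per requested "from" id, can be sketched with the standard library. This simplified stand-in aligns results by position rather than by pandas index, ignores remove_selected, one_to_one and overridden_to_weights, and stores relations in a hypothetical dictionary layout.

```python
import random


def select_one(relations, from_ids, seed=None, discard_empty=True):
    """Sketch: one weighted pick of a "to" id for each requested from_id."""
    rng = random.Random(seed)
    result = []
    for from_id in from_ids:
        rel = relations.get(from_id)
        if not rel or not rel["to_ids"]:
            # No relation available: drop the row, or keep it with None.
            if not discard_empty:
                result.append((from_id, None))
            continue
        chosen = rng.choices(rel["to_ids"], weights=rel["weights"], k=1)[0]
        result.append((from_id, chosen))
    return result


relations = {
    "a": {"to_ids": ["x", "y"], "weights": [3, 1]},
    "b": {"to_ids": ["z"], "weights": [1]},
}
print(select_one(relations, ["a", "a", "c", "b"], seed=5, discard_empty=False))
```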
 
- 
trumania.core.story module¶
- 
class trumania.core.story.Story(name, initiating_population, member_id_field, activity_gen=<trumania.core.random_generators.ConstantGenerator object>, states=None, timer_gen=<trumania.core.random_generators.ConstantDependentGenerator object>, auto_reset_timer=True)[source]¶
- Bases: - object- 
class StoryOps(story)[source]¶
- Bases: - object- 
force_act_next(member_id_field, condition_field=None)[source]¶
- Sets the timer of those members to 0, forcing them to act at the next clock tick 
 
- 
 - 
active_inactive_ids()[source]¶
- Returns: - 2 sets of member ids: the ones active at this turn and the others 
 - 
get_param(param_name, ids)[source]¶
- Parameters: - param_name – either “activity” or “back_to_default_probability”
- ids – population member ids
 - Returns: - the activity level of each requested member id, depending on its current state 
 - 
static init_story_data(member_id_field_name, active_ids)[source]¶
- creates the initial story_data dataframe containing just the id of the currently active members 
 - 
reset_timers(ids=None)[source]¶
- Resets the timers to some random positive number of ticks, related to the activity level of each population row. - We limit to a set of ids and not all the members currently set to zero, since we could have zero timers as a side effect of other stories, in which case we want to trigger an execution at the next clock tick instead of resetting the timer. - Parameters: - ids – the subset of population member ids to impact 
 
- 
trumania.core.util_functions module¶
Collection of utility functions
- 
trumania.core.util_functions.build_ids(size, id_start=0, prefix='id_', max_length=10)[source]¶
- builds a sequential list of string ids of specified size 
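As an illustration only, a function with this signature could be written as below. The id format is an assumption: this page does not say how max_length is used, so the sketch supposes that the numeric part is left-padded with zeros to max_length digits.

```python
def build_ids(size, id_start=0, prefix="id_", max_length=10):
    """Sketch: sequential string ids; zero-padding to max_length is an assumption."""
    return [prefix + str(i).zfill(max_length)
            for i in range(id_start, id_start + size)]


print(build_ids(3, id_start=8, max_length=3))
```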
- 
trumania.core.util_functions.cap_to_total(values, target_total)[source]¶
- returns a copy of values with the largest values possible such that:
- all returned values are <= the original ones
- their sum is == target_total
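One way to satisfy both constraints is a "water-filling" cap: small values are kept as they are and the largest ones are lowered to a common level. The sketch below is one possible reading of "largest values possible" and not necessarily the library's algorithm; it works on floats and returns the values unchanged when their sum is already below the target.

```python
def cap_to_total(values, target_total):
    """Sketch: lowers the largest values so that the sum equals target_total."""
    if sum(values) <= target_total:
        return list(values)
    capped = [0] * len(values)
    remaining = target_total
    # Visit values from smallest to largest, giving each at most a fair share
    # of what is left; untouched small values leave more for the larger ones.
    order = sorted(range(len(values)), key=lambda i: values[i])
    for rank, i in enumerate(order):
        fair_share = remaining / (len(values) - rank)
        capped[i] = min(values[i], fair_share)
        remaining -= capped[i]
    return capped


print(cap_to_total([10, 2, 7], target_total=12))
```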
 
 
- 
trumania.core.util_functions.ensure_non_existing_dir(folder)[source]¶
- makes sure the specified directory does not exist, potentially deleting any file or folder it contains 
- 
trumania.core.util_functions.latest_date_before(starting_date, upper_bound, time_step)[source]¶
- Looks for the latest result_date such that: result_date = starting_date + n * time_step for some integer n, and result_date <= upper_bound. - Returns: - pd.Timestamp 
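The two conditions above determine n as the floor of (upper_bound - starting_date) / time_step. A sketch with the standard library's datetime types follows; the library itself returns a pd.Timestamp, so the types used here are a simplification.

```python
from datetime import datetime, timedelta


def latest_date_before(starting_date, upper_bound, time_step):
    """Sketch: latest starting_date + n * time_step that is <= upper_bound."""
    # Floor division of two timedeltas yields an integer number of steps
    # (negative when upper_bound lies before starting_date).
    n = (upper_bound - starting_date) // time_step
    return starting_date + n * time_step


start = datetime(2024, 1, 1, 0, 0)
print(latest_date_before(start, datetime(2024, 1, 1, 10, 40), timedelta(minutes=15)))
```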
- 
trumania.core.util_functions.load_all_logs(folder)[source]¶
- loads all CSV files contained in this folder and returns them as one dictionary where the key is the filename without the extension 
- 
trumania.core.util_functions.make_random_assign(set1, set2, seed)[source]¶
- Randomly assigns a member of set2 to each member of set1 - Returns: - a dataframe with as many rows as set1 
- 
trumania.core.util_functions.make_random_bipartite_data(group1, group2, p, seed)[source]¶
- Parameters: - group1 (list) – Ids of first group
- group2 (list) – Ids of second group
- p (float) – probability of existence of 1 edge
- seed (int) – seed for random generator
 - Return type: - list - Returns: - all edges in the graph 
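A random bipartite edge list of this kind can be sketched by keeping each possible (group1, group2) pair independently with probability p. This is a standard-library illustration, not the library's implementation, which may rely on numpy or a graph package; the function name is hypothetical.

```python
import random


def random_bipartite_edges(group1, group2, p, seed):
    """Sketch: each of the len(group1) * len(group2) edges exists with probability p."""
    rng = random.Random(seed)
    return [(a, b) for a in group1 for b in group2 if rng.random() < p]


edges = random_bipartite_edges(["u1", "u2"], ["s1", "s2", "s3"], p=0.5, seed=3)
print(edges)
```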
- 
trumania.core.util_functions.merge_2_dicts(dict1, dict2, value_merge_func=None)[source]¶
- Parameters: - dict1 – first dictionary to be merged
- dict2 – second dictionary to be merged
- value_merge_func (function (value1, value2) => value) – specifies how to merge 2 values if a key is present in both dictionaries
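The merging rule can be sketched as follows. What happens on a key conflict when no value_merge_func is given is not stated in this page; raising a ValueError in that case is an assumption of this sketch.

```python
def merge_2_dicts(dict1, dict2, value_merge_func=None):
    """Sketch: union of two dicts, merging values of keys present in both."""
    merged = dict(dict1)
    for key, value in dict2.items():
        if key in merged:
            if value_merge_func is None:
                # Assumption: a conflict without a merge function is an error.
                raise ValueError("key %r is present in both dictionaries" % (key,))
            merged[key] = value_merge_func(merged[key], value)
        else:
            merged[key] = value
    return merged


logs = merge_2_dicts({"calls": [1, 2]}, {"calls": [3], "sms": [4]},
                     value_merge_func=lambda v1, v2: v1 + v2)
```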