types

ARTIFACT_STORAGE_BACKEND

Bases: str, Enum

Artifact storage backend

Source code in lineapy/data/types.py
class ARTIFACT_STORAGE_BACKEND(str, Enum):
    """
    Artifact storage backend
    """

    lineapy = "lineapy"
    mlflow = "mlflow"
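
Because the enum mixes in str, its members can be built from, and compared with, plain strings. A minimal sketch, which re-declares the enum locally so that it runs without LineaPy installed:

```python
from enum import Enum


class ARTIFACT_STORAGE_BACKEND(str, Enum):
    # Local copy of the enum above, so the sketch runs without LineaPy.
    lineapy = "lineapy"
    mlflow = "mlflow"


# Look a member up from its string value, e.g. one read from a config file.
backend = ARTIFACT_STORAGE_BACKEND("mlflow")

# The str mixin makes members compare equal to plain strings.
is_mlflow = backend == "mlflow"
```

An unknown value such as ARTIFACT_STORAGE_BACKEND("s3") raises ValueError, which is standard Enum behavior.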

Artifact

Bases: BaseModel

An artifact points to the value of a node during some execution.

Source code in lineapy/data/types.py
class Artifact(BaseModel):
    """
    An artifact points to the value of a node during some execution.
    """

    node_id: LineaID
    execution_id: LineaID

    date_created: datetime.datetime
    name: str
    version: int

    class Config:
        orm_mode = True

ArtifactInfo

Bases: TypedDict

Artifact backend storage metadata

Attributes:

Name Type Description
lineapy LineaArtifactInfo

storage backend for LineaPy

mlflow NotRequired[MLflowArtifactInfo]

storage backend metadata for MLflow (only exists when the artifact is saved in MLflow)

Source code in lineapy/data/types.py
class ArtifactInfo(TypedDict):
    """
    Artifact backend storage metadata

    Attributes
    ----------
    lineapy: LineaArtifactInfo
        storage backend for LineaPy
    mlflow: NotRequired[MLflowArtifactInfo]
        storage backend metadata for MLflow (only exists when the
        artifact is saved in MLflow)
    """

    lineapy: LineaArtifactInfo
    mlflow: NotRequired[MLflowArtifactInfo]
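
Since the mlflow key only exists for artifacts saved in MLflow, consumers should check for it before reading it. The sketch below illustrates that pattern under several assumptions: LineaInfoStub and MLflowInfoStub are hypothetical, trimmed-down stand-ins for the real dataclasses, the describe helper and the model URI are made up for illustration, and the optional key is declared with the older total=False idiom rather than NotRequired so that the code runs on the standard library alone.

```python
from dataclasses import dataclass
from typing import TypedDict


@dataclass
class LineaInfoStub:
    # Hypothetical stand-in for LineaArtifactInfo (only two fields kept).
    name: str
    version: int


@dataclass
class MLflowInfoStub:
    # Hypothetical stand-in for MLflowArtifactInfo (only one field kept).
    model_uri: str


class _RequiredKeys(TypedDict):
    lineapy: LineaInfoStub


class ArtifactInfoSketch(_RequiredKeys, total=False):
    # total=False marks `mlflow` as optional, much like NotRequired[...] does.
    mlflow: MLflowInfoStub


def describe(info: ArtifactInfoSketch) -> str:
    """Summarize an artifact, mentioning MLflow only when the key exists."""
    base = f"{info['lineapy'].name} v{info['lineapy'].version}"
    mlflow_info = info.get("mlflow")
    if mlflow_info is None:
        return base
    return f"{base} (MLflow: {mlflow_info.model_uri})"


local_only: ArtifactInfoSketch = {"lineapy": LineaInfoStub("model", 1)}
with_mlflow: ArtifactInfoSketch = {
    "lineapy": LineaInfoStub("model", 2),
    "mlflow": MLflowInfoStub("models:/model/2"),  # illustrative URI
}
```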

AssignedVariable

For local variables, this is the node that is assigned to.

Source code in lineapy/data/types.py
class AssignedVariable:
    """
    For local variables, this is the node that is assigned to.
    """

    node_id: LineaID
    assigned_variable: str

BaseNode

Bases: BaseModel

Attributes:

Name Type Description
id str

string version of UUID, which we chose because we do not need to coordinate to make it unique

lineno int

Record the position of the calls. Optional because it is not required by some nodes, such as side-effects (which do not correspond to a line of code).

col_offset int

Record the position of the calls. Optional because it is not required by some nodes, such as side-effects (which do not correspond to a line of code).

end_lineno int

Record the position of the calls. Optional because it is not required by some nodes, such as side-effects (which do not correspond to a line of code).

end_col_offset int

Record the position of the calls. Optional because it is not required by some nodes, such as side-effects (which do not correspond to a line of code).

control_dependency Optional[LineaID]

Points to the ControlFlowNode on which the generation of the current node depends. For example, in the snippet if condition: l.append(0), whether append executes depends on the condition, so the MutateNode corresponding to the append instruction has its control_dependency field pointing to the IfNode of the condition. Refer to tracer.py for usage.

class Config's orm_mode allows us to use from_orm to convert ORM objects to pydantic objects

Source code in lineapy/data/types.py
class BaseNode(BaseModel):
    """
    Attributes
    ----------
    id: str
        string version of UUID, which we chose because
        we do not need to coordinate to make it unique
    lineno: int
        Record the position of the calls. Optional because it is not required by some nodes,
        such as side-effects (which do not correspond to a line of code).
    col_offset: int
        Record the position of the calls. Optional because it is not required by some nodes,
        such as side-effects (which do not correspond to a line of code).
    end_lineno: int
        Record the position of the calls. Optional because it is not required by some nodes,
        such as side-effects (which do not correspond to a line of code).
    end_col_offset: int
        Record the position of the calls. Optional because it is not required by some nodes,
        such as side-effects (which do not correspond to a line of code).
    control_dependency: Optional[LineaID]
        points to a ControlFlowNode which the generation of
        the current node is dependent upon. For example, in the snippet
        `if condition: l.append(0)`, the `append` instruction's execution depends
        on the condition being true or not, hence the MutateNode corresponding to
        the append instruction will have its control_dependency field pointing
        to the IfNode of the condition. Refer to tracer.py for usage.

    `class Config`'s orm_mode allows us to use from_orm to convert ORM
    objects to pydantic objects
    """

    id: LineaID
    session_id: LineaID = Field(repr=False)  # refers to SessionContext.id
    node_type: NodeType = Field(NodeType.Node, repr=False)
    source_location: Optional[SourceLocation] = Field(repr=False)
    control_dependency: Optional[LineaID]

    class Config:
        orm_mode = True

    def __lt__(self, other: object) -> bool:
        """
        Sort nodes by line number and column, putting those without line numbers
        at the beginning.

        Used to break ties in topological node ordering.
        """
        if not isinstance(other, BaseNode):
            return NotImplemented

        if not other.source_location:
            return False
        if not self.source_location:
            return True

        return self.source_location < other.source_location

    def parents(self) -> Iterable[LineaID]:
        """
        Returns the parents of this node.
        """
        # Return control dependencies, which could exist for any node
        if self.control_dependency:
            yield self.control_dependency

__lt__(other)

Sort nodes by line number and column, putting those without line numbers at the beginning.

Used to break ties in topological node ordering.

Source code in lineapy/data/types.py
def __lt__(self, other: object) -> bool:
    """
    Sort nodes by line number and column, putting those without line numbers
    at the beginning.

    Used to break ties in topological node ordering.
    """
    if not isinstance(other, BaseNode):
        return NotImplemented

    if not other.source_location:
        return False
    if not self.source_location:
        return True

    return self.source_location < other.source_location
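
The ordering rule can be tried out in isolation. The sketch below assumes a hypothetical NodeStub class in which the source location is reduced to a (lineno, col_offset) tuple; it is not the real BaseNode, but its __lt__ follows the same three steps.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class NodeStub:
    # Hypothetical stand-in for BaseNode: the location is reduced to a
    # (lineno, col_offset) tuple instead of a SourceLocation object.
    id: str
    source_location: Optional[Tuple[int, int]] = None

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, NodeStub):
            return NotImplemented
        # Same rule as above: nodes without a location sort first.
        if not other.source_location:
            return False
        if not self.source_location:
            return True
        return self.source_location < other.source_location


nodes = [
    NodeStub("call", (3, 0)),
    NodeStub("side_effect"),  # no source location
    NodeStub("literal", (1, 4)),
]
ordered_ids = [node.id for node in sorted(nodes)]
```

Here ordered_ids places side_effect first, followed by literal (line 1) and call (line 3).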

parents()

Returns the parents of this node.

Source code in lineapy/data/types.py
def parents(self) -> Iterable[LineaID]:
    """
    Returns the parents of this node.
    """
    # Return control dependencies, which could exist for any node
    if self.control_dependency:
        yield self.control_dependency

CallNode

Bases: BaseNode

Attributes:

Name Type Description
function_id LineaID

node containing the function being called, which could come from various places: (1) locally defined, (2) imported, or (3) implicitly available, e.g. from builtins (min) or the environment, like get_ipython.

value Object

value of the call result, filled at runtime. It may be cached by the data asset manager

Source code in lineapy/data/types.py
class CallNode(BaseNode):
    """
    Attributes
    ----------
    function_id: LineaID
        node containing the function being called, which could come
        from various places: (1) locally defined, (2) imported, or
        (3) implicitly available, e.g. from builtins (`min`) or the
        environment, like `get_ipython`.
    value: Object
        value of the call result, filled at runtime. It may be cached
        by the data asset manager
    """

    node_type: NodeType = Field(NodeType.CallNode, repr=False)

    function_id: LineaID
    positional_args: List[PositionalArgument] = []
    keyword_args: List[KeywordArgument] = []

    # Mapping of global variables that need to be set to call this function
    global_reads: Dict[str, LineaID] = {}

    # TODO: add documentation
    implicit_dependencies: List[LineaID] = []

    def parents(self) -> Iterable[LineaID]:
        yield from super().parents()
        yield self.function_id
        yield from [node.id for node in self.positional_args]
        yield from [node.value for node in self.keyword_args]
        yield from self.global_reads.values()
        yield from self.implicit_dependencies
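
To see the order in which parents() yields IDs, the sketch below rebuilds the method on plain dataclasses. All class names ending in Stub are hypothetical stand-ins, node IDs are plain strings, and the key field on KeywordArgStub is an assumption, since only value is used above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class PositionalArgStub:
    # Hypothetical stand-in for PositionalArgument.
    id: str


@dataclass
class KeywordArgStub:
    # Hypothetical stand-in for KeywordArgument (`key` is an assumption).
    key: str
    value: str


@dataclass
class CallNodeStub:
    # Hypothetical stand-in for CallNode, with plain strings as node IDs.
    function_id: str
    positional_args: List[PositionalArgStub] = field(default_factory=list)
    keyword_args: List[KeywordArgStub] = field(default_factory=list)
    global_reads: Dict[str, str] = field(default_factory=dict)
    implicit_dependencies: List[str] = field(default_factory=list)
    control_dependency: Optional[str] = None

    def parents(self) -> Iterable[str]:
        # Same order as CallNode.parents() above.
        if self.control_dependency:
            yield self.control_dependency
        yield self.function_id
        yield from (arg.id for arg in self.positional_args)
        yield from (arg.value for arg in self.keyword_args)
        yield from self.global_reads.values()
        yield from self.implicit_dependencies


# Models a call such as `f(x, key=y)` made inside an `if` block.
call = CallNodeStub(
    function_id="f_lookup",
    positional_args=[PositionalArgStub("x_node")],
    keyword_args=[KeywordArgStub("key", "y_node")],
    control_dependency="if_node",
)
parent_ids = list(call.parents())
```

The control dependency comes first, then the function node, then the arguments in call order.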

ControlFlowNode

Bases: BaseNode

Represents a control flow node like if, else, for, while

Source code in lineapy/data/types.py
class ControlFlowNode(BaseNode):
    """
    Represents a control flow node like `if`, `else`, `for`, `while`
    """

    node_type = Field(NodeType.Node, repr=False)

    # Points to the attached node
    # For `if` node, it will be an `else` node, for an `else` node it could be an `if` node, `while` node etc.
    companion_id: Optional[LineaID]

    # LiteralNode containing the code, if the block is not executed
    unexec_id: Optional[LineaID]

    def parents(self) -> Iterable[LineaID]:
        yield from super().parents()
        if self.companion_id:
            yield self.companion_id
        if self.unexec_id:
            yield self.unexec_id

ElseNode

Bases: ControlFlowNode

Represents the else keyword

Source code in lineapy/data/types.py
class ElseNode(ControlFlowNode):
    """
    Represents the `else` keyword
    """

    node_type = Field(NodeType.ElseNode, repr=False)

    # Points to the attached node
    # Could be a node corresponding to `if`, `for`, `while`, etc.
    # The definition here is used only for typing purposes, it will
    # automatically be included in super().parents()
    companion_id: LineaID

Execution

Bases: BaseModel

An execution is one session of running many nodes and recording their values.

Source code in lineapy/data/types.py
class Execution(BaseModel):
    """
    An execution is one session of running many nodes and recording their values.
    """

    id: LineaID
    timestamp: Optional[datetime.datetime]

    class Config:
        orm_mode = True

GlobalNode

Bases: BaseNode

Represents a lookup of a global variable that was set as a side effect in another node.

Source code in lineapy/data/types.py
class GlobalNode(BaseNode):
    """
    Represents a lookup of a global variable that was set as a side effect
    in another node.
    """

    node_type = Field(NodeType.GlobalNode, repr=False)

    # The name of the variable to look up from the result of the call
    name: str

    # Points to the call node that updated the global
    call_id: LineaID

    def parents(self) -> Iterable[LineaID]:
        yield from super().parents()
        yield self.call_id

IfNode

Bases: ControlFlowNode

Represents the if keyword

Source code in lineapy/data/types.py
class IfNode(ControlFlowNode):
    """
    Represents the `if` keyword
    """

    node_type = Field(NodeType.IfNode, repr=False)

    # Points to the call node which forms the expression to test
    test_id: LineaID

    def parents(self) -> Iterable[LineaID]:
        yield from super().parents()
        yield self.test_id
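
Each level of the control-flow hierarchy extends parents() through super(). The sketch below mirrors that chain with hypothetical stub classes and string IDs, to show which IDs an if node with an attached else ends up depending on.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class BaseStub:
    # Hypothetical stand-in for BaseNode.
    id: str
    control_dependency: Optional[str] = None

    def parents(self) -> Iterable[str]:
        if self.control_dependency:
            yield self.control_dependency


@dataclass
class ControlFlowStub(BaseStub):
    # Hypothetical stand-in for ControlFlowNode.
    companion_id: Optional[str] = None
    unexec_id: Optional[str] = None

    def parents(self) -> Iterable[str]:
        yield from super().parents()
        if self.companion_id:
            yield self.companion_id
        if self.unexec_id:
            yield self.unexec_id


@dataclass
class IfStub(ControlFlowStub):
    # Hypothetical stand-in for IfNode. The default is only needed because
    # dataclass fields without defaults cannot follow fields with defaults.
    test_id: str = ""

    def parents(self) -> Iterable[str]:
        yield from super().parents()
        yield self.test_id


# An `if` whose companion is an `else` node and whose test is a call node.
if_node = IfStub(id="if_1", companion_id="else_1", test_id="condition_call")
if_parents = list(if_node.parents())
```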

ImportNode

Bases: BaseNode

Imported libraries.

version and package_name are retrieved at runtime. package_name may be different from import name, see get_lib_package_version.

These are optional because the info is acquired at runtime.

Note

This node is not actually used for execution (l_import CallNodes are used for that); it is more of a decoration for metadata retrieval.

Source code in lineapy/data/types.py
class ImportNode(BaseNode):
    """
    Imported libraries.

    `version` and `package_name` are retrieved at runtime.
    `package_name` may be different from import name, see get_lib_package_version.

    These are optional because the info is acquired at runtime.

    ??? note

        This node is not actually used for execution (`l_import` CallNodes are
        used for that); it is more of a decoration for metadata retrieval.
    """

    node_type: NodeType = NodeType.ImportNode
    name: str
    version: Optional[str] = None
    package_name: Optional[str] = None
    path: Optional[str] = None

LineaArtifactDef

Bases: TypedDict

Definition of an artifact; new keys (user, project, ...) can be added in the future.

Source code in lineapy/data/types.py
class LineaArtifactDef(TypedDict):
    """
    Definition of an artifact; new keys (user, project, ...) can be added
    in the future.
    """

    artifact_name: str
    version: NotRequired[Optional[int]]

LineaArtifactInfo dataclass

Backend storage metadata for LineaPy

Source code in lineapy/data/types.py
@dataclass
class LineaArtifactInfo:
    """
    Backend storage metadata for LineaPy
    """

    artifact_id: int
    name: str
    version: int
    execution_id: LineaID
    session_id: LineaID
    node_id: LineaID
    date_created: datetime.datetime
    storage_path: str
    storage_backend: ARTIFACT_STORAGE_BACKEND

LookupNode

Bases: BaseNode

For unknown/undefined variables e.g. SQLcontext, get_ipython, int.

Source code in lineapy/data/types.py
class LookupNode(BaseNode):
    """
    For unknown/undefined variables e.g. SQLcontext, get_ipython, int.
    """

    node_type = Field(NodeType.LookupNode, repr=False)
    name: str

MLflowArtifactInfo dataclass

Backend storage metadata for MLflow

Source code in lineapy/data/types.py
@dataclass
class MLflowArtifactInfo:
    """
    Backend storage metadata for MLflow
    """

    id: int
    artifact_id: int
    tracking_uri: str
    registry_uri: Optional[str]
    model_uri: str
    model_flavor: str

MutateNode

Bases: BaseNode

Represents a mutation of a node's value.

After a call mutates a node, later references to that node will instead refer to this mutate node.

Source code in lineapy/data/types.py
class MutateNode(BaseNode):
    """
    Represents a mutation of a node's value.

    After a call mutates a node, later references to that node will
    instead refer to this mutate node.
    """

    node_type = Field(NodeType.MutateNode, repr=False)

    # Points to the original node that was mutated
    source_id: LineaID

    # Points to the CallNode that did the mutation
    call_id: LineaID

    def parents(self) -> Iterable[LineaID]:
        yield from super().parents()
        yield self.source_id
        yield self.call_id
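
The redirection of later references can be pictured with a small variable table. This is only a sketch: MutateStub is a hypothetical stand-in, the variables dictionary is an assumption about how a tracer might track names, and the node IDs are made up.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class MutateStub:
    # Hypothetical stand-in for MutateNode, with plain strings as node IDs.
    id: str
    source_id: str
    call_id: str

    def parents(self) -> Iterable[str]:
        yield self.source_id
        yield self.call_id


# Hypothetical variable table: which node currently holds each variable.
variables: Dict[str, str] = {"l": "list_literal"}

# `l.append(0)` is recorded as a call, plus a mutate node for `l`.
mutation = MutateStub(
    id="l_mutated",
    source_id=variables["l"],
    call_id="append_call",
)

# Later reads of `l` now resolve to the mutate node, not the original list.
variables["l"] = mutation.id
```

Because the mutate node lists both the original node and the mutating call as parents, anything that later depends on l also depends on the append call.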

PipelineType

Bases: Enum

Pipeline types tell to_pipeline what to expect

Attributes:

Name Type Description
SCRIPT int

the pipeline is wrapped as a python script

AIRFLOW int

the pipeline is wrapped as an airflow dag

DVC int

the pipeline is wrapped as a DVC pipeline

ARGO int

the pipeline is wrapped as an Argo workflow dag

KUBEFLOW int

the pipeline is defined using Kubeflow's python SDK

RAY int

the pipeline is wrapped as a Ray DAG

Source code in lineapy/data/types.py
class PipelineType(Enum):
    """
    Pipeline types tell `to_pipeline` what to expect

    Attributes
    ----------
    SCRIPT: int
        the pipeline is wrapped as a python script
    AIRFLOW: int
        the pipeline is wrapped as an airflow dag
    DVC: int
        the pipeline is wrapped as a DVC pipeline
    ARGO: int
        the pipeline is wrapped as an Argo workflow dag
    KUBEFLOW: int
        the pipeline is defined using Kubeflow's python SDK
    RAY: int
        the pipeline is wrapped as a Ray DAG
    """

    SCRIPT = 1
    AIRFLOW = 2
    DVC = 3
    ARGO = 4
    KUBEFLOW = 5
    RAY = 6
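
Enum members can be looked up by name, which is convenient when the pipeline type arrives as text (for example from a command line). The sketch below re-declares the enum locally; parse_pipeline_type is a hypothetical helper, not part of LineaPy.

```python
from enum import Enum


class PipelineType(Enum):
    # Local copy of the enum above, so the sketch runs without LineaPy.
    SCRIPT = 1
    AIRFLOW = 2
    DVC = 3
    ARGO = 4
    KUBEFLOW = 5
    RAY = 6


def parse_pipeline_type(name: str) -> PipelineType:
    """Hypothetical helper: map a user-supplied name to a PipelineType."""
    try:
        return PipelineType[name.upper()]
    except KeyError:
        valid = ", ".join(member.name for member in PipelineType)
        raise ValueError(
            f"Unknown pipeline type {name!r}; expected one of: {valid}"
        ) from None


chosen = parse_pipeline_type("airflow")
```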

SessionContext

Bases: BaseModel

Each trace of a script/notebook is a "Session".

:param working_directory: the directory from which the user ran the code

Source code in lineapy/data/types.py
class SessionContext(BaseModel):
    """
    Each trace of a script/notebook is a "Session".

    :param working_directory: the directory from which the user ran the code

    """

    id: LineaID  # populated on creation by uuid.uuid4()
    environment_type: SessionType
    python_version: str
    creation_time: datetime.datetime
    working_directory: str  # must be passed in for now
    session_name: Optional[str] = None
    user_name: Optional[str] = None
    # The ID of the corresponding execution
    execution_id: LineaID

    class Config:
        orm_mode = True
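
The comment on id says it is populated by uuid.uuid4(), and BaseNode describes IDs as the string version of a UUID. A minimal sketch of generating such an ID follows; the LineaID alias is assumed here to be a NewType over str, and new_linea_id is a hypothetical helper.

```python
import uuid
from typing import NewType

# Assumption: LineaID is a thin alias over str holding a UUID string.
LineaID = NewType("LineaID", str)


def new_linea_id() -> LineaID:
    """Hypothetical helper: create a fresh ID without any coordination."""
    return LineaID(str(uuid.uuid4()))


session_id = new_linea_id()
execution_id = new_linea_id()
```

Random version-4 UUIDs can be generated independently in every session, which is why no central coordination is needed to keep them unique in practice.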

SessionType

Bases: Enum

Session types tell the tracer what to expect:

- JUPYTER: the tracer needs to progressively add more nodes to the graph
- SCRIPT: the easiest case, run everything until the end

Source code in lineapy/data/types.py
class SessionType(Enum):
    """
    Session types allow the tracer to know what to expect
    - JUPYTER: the tracer needs to progressively add more nodes to the graph
    - SCRIPT: the easiest case, run everything until the end
    """

    JUPYTER = 1
    SCRIPT = 2

SourceCode

Bases: BaseModel

The source code that was executed.

Source code in lineapy/data/types.py
class SourceCode(BaseModel):
    """
    The source code that was executed.
    """

    id: LineaID
    code: str
    location: SourceCodeLocation

    class Config:
        orm_mode = True

    def __hash__(self) -> int:
        return hash((self.id))

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SourceCode):
            return NotImplemented
        return self.id == other.id

    def __lt__(self, other: object) -> bool:
        """
        Returns true if this source code comes before the other; only applies
        to Jupyter sources.

        Returns NotImplemented if they are not from the same file
        or the same Jupyter session.
        """
        if not isinstance(other, SourceCode):
            return NotImplemented
        self_location = self.location
        other_location = other.location
        if isinstance(self_location, Path) and isinstance(
            other_location, Path
        ):
            # If they are of different files, we can't compare them
            if self_location != other_location:
                return NotImplemented
            # Otherwise, they are equal so not lt
            return False
        elif isinstance(self_location, JupyterCell) and isinstance(
            other_location, JupyterCell
        ):
            # If they are from different sessions, we can't compare them.
            if self_location.session_id != other_location.session_id:
                return NotImplemented
            # Compare Jupyter cells by execution count; line numbers are
            # compared in SourceLocation.__lt__
            return (self_location.execution_count) < (
                other_location.execution_count
            )
        # If they are different source locations, we don't know how to compare
        assert type(self_location) == type(other_location)
        return NotImplemented

__lt__(other)

Returns true if this source code comes before the other; only applies to Jupyter sources.

Returns NotImplemented if they are not from the same file or the same Jupyter session.

Source code in lineapy/data/types.py
def __lt__(self, other: object) -> bool:
    """
    Returns true if this source code comes before the other; only applies
    to Jupyter sources.

    Returns NotImplemented if they are not from the same file
    or the same Jupyter session.
    """
    if not isinstance(other, SourceCode):
        return NotImplemented
    self_location = self.location
    other_location = other.location
    if isinstance(self_location, Path) and isinstance(
        other_location, Path
    ):
        # If they are of different files, we can't compare them
        if self_location != other_location:
            return NotImplemented
        # Otherwise, they are equal so not lt
        return False
    elif isinstance(self_location, JupyterCell) and isinstance(
        other_location, JupyterCell
    ):
        # If they are from different sessions, we can't compare them.
        if self_location.session_id != other_location.session_id:
            return NotImplemented
        # Compare Jupyter cells by execution count; line numbers are
        # compared in SourceLocation.__lt__
        return (self_location.execution_count) < (
            other_location.execution_count
        )
    # If they are different source locations, we don't know how to compare
    assert type(self_location) == type(other_location)
    return NotImplemented
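
Returning NotImplemented from __lt__ has a visible effect at the call site: when neither operand can handle the comparison, Python's < operator raises TypeError. The sketch below shows this with hypothetical stand-ins (CellStub, SourceStub) that keep only the Jupyter branch of the logic above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellStub:
    # Hypothetical stand-in for JupyterCell.
    execution_count: int
    session_id: str


@dataclass
class SourceStub:
    # Hypothetical stand-in for SourceCode, restricted to Jupyter cells.
    location: CellStub

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, SourceStub):
            return NotImplemented
        if self.location.session_id != other.location.session_id:
            return NotImplemented
        return self.location.execution_count < other.location.execution_count


first = SourceStub(CellStub(execution_count=1, session_id="s1"))
second = SourceStub(CellStub(execution_count=2, session_id="s1"))
elsewhere = SourceStub(CellStub(execution_count=1, session_id="s2"))

# Both cells belong to session "s1", so they are ordered by execution count.
same_session = first < second

try:
    first < elsewhere
    cross_session = "comparable"
except TypeError:
    # Neither side can answer, so the `<` operator raises TypeError.
    cross_session = "TypeError"
```

Calling first.__lt__(elsewhere) directly returns the NotImplemented sentinel instead of raising, so code that needs the sentinel has to call the method rather than use the operator.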

SourceLocation

Bases: BaseModel

The location of the original source.

Eventually we also need to support fused locations, like MLIR (https://mlir.llvm.org/docs/Dialects/Builtin/#location-attributes), but for now we just point at the original user source location.

Source code in lineapy/data/types.py
class SourceLocation(BaseModel):
    """
    The location of the original source.

    Eventually we also need to support fused locations, like MLIR:
    https://mlir.llvm.org/docs/Dialects/Builtin/#location-attributes
    but for now we just point at the original user source location.
    """

    lineno: int
    col_offset: int = Field(repr=False)
    end_lineno: int = Field(repr=False)
    end_col_offset: int = Field(repr=False)
    source_code: SourceCode = Field(repr=False)

    def __lt__(self, other: object) -> bool:
        """
        Returns true if this source location comes before the other.

        Returns NotImplemented if they are not from the same file
        or the same Jupyter session.
        """
        if not isinstance(other, SourceLocation):
            return NotImplemented
        source_code_lt = self.source_code < other.source_code
        # If they are different source locations, we don't know how to compare
        if source_code_lt == NotImplemented:
            return NotImplemented
        # Otherwise, if they are from the same source, compare by line number
        if self.source_code.location == other.source_code.location:
            return (self.lineno, self.col_offset) < (
                other.lineno,
                other.col_offset,
            )
        return source_code_lt

    class Config:
        orm_mode = True

__lt__(other)

Returns true if this source location comes before the other.

Returns NotImplemented if they are not from the same file or the same Jupyter session.

Source code in lineapy/data/types.py
def __lt__(self, other: object) -> bool:
    """
    Returns true if this source location comes before the other.

    Returns NotImplemented if they are not from the same file
    or the same Jupyter session.
    """
    if not isinstance(other, SourceLocation):
        return NotImplemented
    source_code_lt = self.source_code < other.source_code
    # If they are different source locations, we don't know how to compare
    if source_code_lt == NotImplemented:
        return NotImplemented
    # Otherwise, if they are from the same source, compare by line number
    if self.source_code.location == other.source_code.location:
        return (self.lineno, self.col_offset) < (
            other.lineno,
            other.col_offset,
        )
    return source_code_lt
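
Within a single source, the comparison reduces to ordering (lineno, col_offset) tuples, which Python compares element by element. A small illustration with made-up positions:

```python
from typing import List, Tuple

# Hypothetical (lineno, col_offset) pairs for four nodes in one file.
positions: List[Tuple[int, int]] = [(4, 0), (3, 12), (3, 4), (10, 2)]

# Tuples compare element by element: line number first, then column.
in_source_order = sorted(positions)
```

Two nodes on line 3 are therefore ordered by their column offsets, and both come before anything on line 4.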

ValueType

Bases: Enum

Lowercase because the frontend API assumes the string "chart" exactly as is.

Source code in lineapy/data/types.py
class ValueType(Enum):
    """
    Lowercase because the frontend API assumes the string "chart"
    exactly as is.
    """

    # [TODO] Rename (need coordination with linea-server):
    # - `dataset` really is a table
    # - `value` means it's a literal (e.g., int/str)

    chart = 1
    array = 2
    dataset = 3
    code = 4
    value = 5  # includes int, string, bool
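
If the member names are what is exchanged with the frontend, as the docstring suggests, a name lookup converts between the two representations. The sketch below re-declares the enum locally, and the assumption that the name (rather than the integer value) is the wire format is not confirmed by the source.

```python
from enum import Enum


class ValueType(Enum):
    # Local copy of the enum above, so the sketch runs without LineaPy.
    chart = 1
    array = 2
    dataset = 3
    code = 4
    value = 5


# Assumption: the frontend sends and receives the lowercase member name.
from_frontend = ValueType["chart"]
to_frontend = ValueType.dataset.name
```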
