DepAtlasSource Internals

DepAtlasSource is the canonical source-first engine in DependencyAtlas. It owns:

  • stable file/module/method/type identity
  • the first layer of call evidence
  • Julia-specific structure semantics such as include, import, overload families, and generated declarations

It is intentionally no longer a monolithic scanner. The current design is a staged backend with narrow file boundaries, so that Julia-specific behavior can keep being added without the engine turning back into one large script.

Current Phase Layout

The engine is now easiest to understand as seven consecutive phases:

  1. Syntax scan
  2. Lowering
  3. Pipeline orchestration
  4. Structure apply
  5. Constructor families
  6. Call apply
  7. Dispatch narrowing and graph annotation

Those phases are implemented across these files:

  • src/da_source/syntax_scan.jl
  • src/da_source/lowering_shared.jl
  • src/da_source/structure_lowering.jl
  • src/da_source/binding_timeline.jl
  • src/da_source/macro_templates.jl
  • src/da_source/call_lowering.jl
  • src/da_source/pipeline_orchestration.jl
  • src/da_source/structure_apply.jl
  • src/da_source/call_apply.jl
  • src/da_source/constructor_families.jl
  • src/da_source/dispatch_types.jl
  • src/da_source/dispatch.jl
  • src/da_source/scan.jl

1. Syntax Scan

syntax_scan.jl is the JuliaSyntax frontend.

Its job is to emit raw DASourceSyntaxFact records that stay close to source structure:

  • modules
  • imports and usings
  • includes
  • method and type definitions
  • body calls
  • local bindings
  • return values
  • explicit macro calls
  • project-local macro-generated quoted calls

Important boundaries:

  • it may classify syntax and attach raw payload facts
  • it must not build graph nodes or edges
  • it must not depend on the later graph indexes

This file also contains the transparent macro wrapper rules for source-level call extraction. Those rules are syntax-layer policy, not graph-layer policy.
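
As a rough illustration of the boundary above, the sketch below emits raw facts from a parsed expression without touching any graph state. It is a minimal sketch under stated assumptions: `SketchSyntaxFact`, `scan_expr!`, and the payload keys are hypothetical names, and Base's `Meta.parse` stands in for the JuliaSyntax frontend so the example stays self-contained. Only the short-form `f(x) = ...` definition is recognized.

```julia
# Hypothetical stand-in for a raw syntax fact; the real DASourceSyntaxFact layout may differ.
struct SketchSyntaxFact
    kind::Symbol                 # e.g. :method_def or :body_call
    line::Int
    payload::Dict{Symbol,Any}    # free-form raw payload
end

# Recognize only the short-form definition `f(args...) = body`.
is_short_method_def(ex::Expr) =
    ex.head === :(=) && ex.args[1] isa Expr && ex.args[1].head === :call

# Walk an expression and emit facts. No graph nodes or edges are created here.
function scan_expr!(facts::Vector{SketchSyntaxFact}, ex, line::Int)
    ex isa Expr || return facts
    if is_short_method_def(ex)
        name = string(ex.args[1].args[1])
        push!(facts, SketchSyntaxFact(:method_def, line, Dict{Symbol,Any}(:name => name)))
        # Scan only the body, so the signature is not reported as a call.
        return scan_expr!(facts, ex.args[2], line)
    end
    if ex.head === :call
        payload = Dict{Symbol,Any}(:callee => string(ex.args[1]),
                                   :nargs => length(ex.args) - 1)
        push!(facts, SketchSyntaxFact(:body_call, line, payload))
    end
    for arg in ex.args
        scan_expr!(facts, arg, line)
    end
    return facts
end

facts = scan_expr!(SketchSyntaxFact[], Meta.parse("f(x) = g(h(x), 1)"), 1)
```

The facts stay close to source structure: one definition fact followed by one call fact per call expression, in source order.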

2. Lowering

Lowering is now split across several files instead of living in one.

Shared lowered IR

lowering_shared.jl defines the common lowered structs used by later phases:

  • definition spans
  • call-site facts
  • local binding summaries
  • argument-shape summaries
  • return summaries

This is the main seam between raw syntax facts and the more stable ownership-aware pipeline.

Structure lowering

structure_lowering.jl turns syntax facts into structure events:

  • file discovered
  • module defined
  • include
  • module import
  • method/type/generic definition

Binding timeline

binding_timeline.jl computes owner-line snapshots for:

  • local module-valued bindings
  • local value bindings
  • method return summaries

The important current rule is that timeline logic should be shared. We now use one generic owner-line snapshot resolver instead of keeping separate mirrored implementations for module bindings and value bindings.
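
One way to picture a shared resolver is a single function that is generic over the bound value, so module-valued and plain value bindings go through the same code. This is a minimal sketch, not the engine's implementation: `SketchBindingEvent` and `snapshot_at` are hypothetical names, and the rule that a line only sees bindings from strictly earlier lines is an assumption.

```julia
# Hypothetical binding event: `name` is bound to `value` at `line` inside `owner`.
struct SketchBindingEvent{T}
    owner::String
    line::Int
    name::Symbol
    value::T
end

# Generic resolver: which bindings are visible in `owner` just before `line`?
# Assumption: a line sees only bindings made on strictly earlier lines.
function snapshot_at(events::Vector{SketchBindingEvent{T}}, owner::AbstractString,
                     line::Integer) where {T}
    snapshot = Dict{Symbol,T}()
    for ev in sort(events; by = e -> e.line)
        if ev.owner == owner && ev.line < line
            snapshot[ev.name] = ev.value   # later bindings shadow earlier ones
        end
    end
    return snapshot
end

# The same resolver serves module-valued bindings ...
module_events = [
    SketchBindingEvent("f", 5, :M, :Bar),
    SketchBindingEvent("f", 2, :M, :Foo),
    SketchBindingEvent("g", 3, :M, :Baz),
]
# ... and plain value bindings.
value_events = [SketchBindingEvent("f", 3, :x, "Int literal")]
```

Because the element type is a parameter, adding a third kind of timeline would not need a third copy of the resolver.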

Macro templates

macro_templates.jl handles project-local macro-generated declarations and calls:

  • template methods
  • template types
  • generated method instantiation
  • generated call-site expansion

This layer is deliberately source-first. It scans project-local quoted/generated code without doing general-purpose macro expansion.

Call lowering

call_lowering.jl turns body-call syntax facts into ownership-aware call events.

This is where lowered call facts get:

  • owner method identity
  • local module binding snapshots
  • local value binding snapshots
  • call origin tagging such as source_explicit vs macro_generated
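
The enrichment listed above can be sketched as a lookup from a call's line to the innermost enclosing definition span, plus an origin tag. The names `SketchDefinitionSpan`, `SketchLoweredCall`, `owner_at`, and `lower_call` are hypothetical, and representing the origin tags as symbols is an assumption made for illustration.

```julia
# Hypothetical definition span for an owner method.
struct SketchDefinitionSpan
    owner_id::String
    first_line::Int
    last_line::Int
end

# Hypothetical lowered call fact.
struct SketchLoweredCall
    callee::String
    line::Int
    owner_id::Union{String,Nothing}
    origin::Symbol   # :source_explicit or :macro_generated
end

# Pick the smallest span that contains `line`, so nested definitions win.
function owner_at(spans, line::Integer)
    best = nothing
    for span in spans
        if span.first_line <= line <= span.last_line
            if best === nothing ||
               (span.last_line - span.first_line) < (best.last_line - best.first_line)
                best = span
            end
        end
    end
    return best === nothing ? nothing : best.owner_id
end

function lower_call(spans, callee::AbstractString, line::Integer; generated::Bool = false)
    origin = generated ? :macro_generated : :source_explicit
    return SketchLoweredCall(String(callee), line, owner_at(spans, line), origin)
end

spans = [SketchDefinitionSpan("outer", 1, 20), SketchDefinitionSpan("inner", 5, 8)]
```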

3. Pipeline Orchestration

pipeline_orchestration.jl is the source-engine frontend orchestrator.

Its job is to:

  • read each file once
  • cache both source text and lines
  • run syntax scan
  • run structure lowering
  • collect macro templates
  • run call lowering, only after structure lowering and template collection

This file should stay boring. If it starts accumulating semantic policy again, the right fix is usually to move code back into lowering or apply layers.
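
The "read each file once, cache both text and lines" step could look roughly like the following. This is a sketch under assumptions: `SketchSourceCache` and `load!` are hypothetical names, and the real orchestrator may key or store its cache differently.

```julia
# Hypothetical per-run cache holding both representations of each file.
struct SketchSourceCache
    text::Dict{String,String}
    lines::Dict{String,Vector{String}}
end
SketchSourceCache() = SketchSourceCache(Dict{String,String}(), Dict{String,Vector{String}}())

# Read a file at most once; later phases reuse the cached text and lines.
function load!(cache::SketchSourceCache, path::AbstractString)
    key = String(path)
    haskey(cache.text, key) && return cache.text[key]
    text = read(key, String)
    cache.text[key] = text
    cache.lines[key] = String.(split(text, '\n'))
    return text
end
```

Later phases that need line-oriented access can index `cache.lines[path]` directly instead of splitting the text again.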

4. Structure Apply

structure_apply.jl applies structure-side facts to the canonical graph:

  • file nodes
  • module nodes
  • include/import edges
  • method/type/generic nodes
  • generated declaration metadata
  • dispatch signature metadata

This is the stage that defines canonical graph identity.

5. Constructor Families

constructor_families.jl is intentionally separate from general structure apply.

It annotates:

  • constructor methods
  • constructor ownership by type
  • implicit default constructors
  • constructor family metadata stored on type nodes

This logic is graph-normalization policy, not syntax parsing policy.
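
A minimal sketch of the annotation idea follows. It relies on one language rule: Julia supplies default constructors only when a type declares no inner constructor. Everything else is an assumption for illustration, including the names `SketchTypeDecl`, `SketchMethodDecl`, and `constructor_family`, and the metadata keys.

```julia
# Hypothetical summaries of declarations already present in the graph.
struct SketchTypeDecl
    name::String
    field_count::Int
    has_inner_constructor::Bool
end

struct SketchMethodDecl
    name::String
    arity::Int
end

# Build constructor-family metadata for one type node.
function constructor_family(t::SketchTypeDecl, methods)
    # Methods that share the type's name are treated as explicit constructors.
    explicit = [m for m in methods if m.name == t.name]
    family = Dict{Symbol,Any}(:explicit => explicit, :implicit_default_arity => nothing)
    # Julia provides a default constructor, one argument per field,
    # only when no inner constructor is declared.
    if !t.has_inner_constructor
        family[:implicit_default_arity] = t.field_count
    end
    return family
end

methods_seen = [SketchMethodDecl("Point", 1), SketchMethodDecl("norm", 1)]
point_family = constructor_family(SketchTypeDecl("Point", 2, false), methods_seen)
strict_family = constructor_family(SketchTypeDecl("Strict", 1, true), methods_seen)
```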

6. Call Apply

call_apply.jl applies call facts after structure identity exists.

It is responsible for:

  • constructor-call edges
  • project-local method targets
  • qualified/internal vs external resolution
  • external fallback nodes
  • call evidence assembly
  • dispatch-based narrowing hooks

This file should own call-to-graph policy. It should not grow raw syntax walking logic.
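
The internal-versus-external decision with a fallback node can be sketched as below. This is only an illustration: `resolve_call_target`, the result shape, and the `"external:"` identifier prefix are all hypothetical and not taken from the engine.

```julia
# Resolve a callee name against project-local methods, which must already be
# indexed by structure apply. Unknown names fall back to an external placeholder.
function resolve_call_target(local_methods::Dict{String,Vector{String}},
                             callee::AbstractString)
    candidates = get(local_methods, String(callee), String[])
    if !isempty(candidates)
        return (kind = :internal, targets = candidates)
    end
    # Hypothetical fallback node id for a symbol not defined in the project.
    return (kind = :external, targets = ["external:" * String(callee)])
end

local_methods = Dict("area" => ["Shapes.area(Circle)", "Shapes.area(Square)"])
```

Note that the function only consumes names and an index. It does no syntax walking, which matches the rule that raw syntax logic stays out of this stage.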

7. Dispatch Narrowing

dispatch_types.jl and dispatch.jl form the lightweight static dispatch layer.

This layer is intentionally small. Its job is not to become a whole semantic engine; it only narrows already-plausible local candidates using:

  • arity
  • local value flow
  • owner parameter environments
  • type hierarchy metadata
  • field access and type-object hints

If a future change requires whole-program semantic reasoning, that is usually a sign the work belongs in a focused JET view rather than the base source graph.
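
To make the "narrowing only" scope concrete, here is a small sketch that filters candidates by arity and then by a type-hierarchy lookup. It is not the engine's algorithm: `SketchCandidate`, `is_subtype_of`, and `narrow` are hypothetical names, types are modeled as plain symbols, and an unknown argument type (`nothing`) is assumed never to exclude a candidate.

```julia
# Hypothetical candidate method with one declared type per positional parameter.
struct SketchCandidate
    id::String
    param_types::Vector{Symbol}   # :Any means unconstrained
end

# Walk the recorded supertype chain from `t` upward, looking for `target`.
function is_subtype_of(supertypes::Dict{Symbol,Symbol}, t::Symbol, target::Symbol)
    target === :Any && return true
    current = t
    for _ in 0:length(supertypes)     # bounded walk, so a malformed cycle cannot loop forever
        current === target && return true
        current = get(supertypes, current, nothing)
        current === nothing && return false
    end
    return false
end

# Keep candidates whose arity matches and whose parameters accept the known argument types.
function narrow(candidates, arg_types, supertypes)
    by_arity = filter(c -> length(c.param_types) == length(arg_types), candidates)
    return filter(by_arity) do c
        all(zip(c.param_types, arg_types)) do (p, a)
            a === nothing || is_subtype_of(supertypes, a, p)
        end
    end
end

supertypes = Dict(:Dog => :Animal, :Animal => :Any, :Rock => :Any)
candidates = [
    SketchCandidate("speak(Animal)", [:Animal]),
    SketchCandidate("speak(Rock)", [:Rock]),
    SketchCandidate("speak(Any, Any)", [:Any, :Any]),
]
```

The sketch never invents a target. It can only remove entries from a candidate list that an earlier stage already considered plausible.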

Current Maintenance Rules

These rules reflect the current architecture and are worth preserving.

Keep syntax and graph policy separate

  • syntax scan may discover raw facts
  • lowering may rebuild ownership and timelines
  • apply stages may decide graph identity and edge policy

Mixing these again is the fastest way to make the engine hard to change.

Prefer typed lowered IR over ad hoc Dict plumbing

Raw syntax facts still carry payload dictionaries, but the hot lowering path now uses typed structs for:

  • argument shapes
  • local value summaries
  • return summaries
  • call-site facts

When adding new behavior, prefer extending those lowered structs before inventing another free-form payload convention.
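
The difference between the two styles can be sketched as a single conversion point where a raw payload dictionary becomes a typed struct. The struct `SketchArgShape`, the function `lower_arg_shape`, and the payload keys are hypothetical and chosen only for illustration.

```julia
# Hypothetical typed summary of a call's argument shape.
struct SketchArgShape
    positional::Int
    keywords::Vector{Symbol}
    has_splat::Bool
end

# Convert a free-form payload once, at the lowering boundary.
# Missing keys get explicit defaults here instead of at every use site.
function lower_arg_shape(payload::AbstractDict)
    return SketchArgShape(
        Int(get(payload, :positional, 0)),
        Symbol[Symbol(k) for k in get(payload, :keywords, Symbol[])],
        Bool(get(payload, :has_splat, false)),
    )
end

shape = lower_arg_shape(Dict{Symbol,Any}(:positional => 2, :keywords => ["by", "rev"]))
```

Downstream code then reads `shape.positional` with a known type, and a misspelled field name fails immediately instead of silently returning a default.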

Do not reintroduce compatibility shells

Recent cleanup intentionally removed umbrella files and dead wrappers. If a file needs to be split again, prefer:

  • moving code
  • updating include order
  • deleting the old surface

over keeping temporary compatibility layers around forever.

Keep orchestration thin

build_base_graph in scan.jl should remain an orchestrator:

  • scan
  • semantic context
  • structure apply
  • overload annotation
  • constructor annotation
  • call apply
  • final edge annotation

If it starts gaining case-specific logic, that logic probably belongs in one of the dedicated phase files.
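
A thin orchestrator in this spirit might be little more than an ordered list of phase calls. The sketch below uses no-op stubs and hypothetical names (`run_phases!`, `build_base_graph_sketch`). It shows only the shape of the function and does not reflect the real signature of `build_base_graph`.

```julia
# Run each phase in order and record its name. No case-specific logic lives here.
function run_phases!(state, phases)
    for (name, phase!) in phases
        phase!(state)
        push!(state[:completed], name)
    end
    return state
end

function build_base_graph_sketch()
    state = Dict{Symbol,Any}(:completed => Symbol[])
    noop!(_state) = nothing   # stand-in for the real phase entry points
    phases = [
        :scan => noop!,
        :semantic_context => noop!,
        :structure_apply => noop!,
        :overload_annotation => noop!,
        :constructor_annotation => noop!,
        :call_apply => noop!,
        :final_edge_annotation => noop!,
    ]
    return run_phases!(state, phases)
end
```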

What Still Intentionally Remains Imperfect

The current architecture is much cleaner than before, but a few design seams are still intentional:

  • raw DASourceSyntaxFact.payload is still dictionary-based
  • external symbol indexing still has its own scan path
  • some graph metadata remains dictionary-shaped at the final node/edge boundary

Those are real seams, but they are no longer blocking day-to-day engine work.