Orchestration
EdgeOrchestrator
-
class EdgeOrchestrator : public Application
Orchestrator for edge computing workloads with admission control and scheduling.
EdgeOrchestrator manages the execution of computational tasks on a cluster of backend workers. It provides:
Admission control via pluggable AdmissionPolicy
Task scheduling via pluggable Scheduler
The orchestrator supports mixed task types through a task type registry. Each task type is registered via RegisterTaskType() with its deserializer callbacks, enabling DAGs containing different task types (e.g., ImageTask and LlmTask in the same workflow).
Example usage:
Ptr<EdgeOrchestrator> orchestrator = CreateObject<EdgeOrchestrator>(); orchestrator->SetCluster(cluster); orchestrator->SetAttribute("Scheduler", PointerValue(scheduler)); orchestrator->SetAttribute("AdmissionPolicy", PointerValue(policy));
Public Types
-
typedef Callback<Ptr<Task>, Ptr<Packet>, uint64_t&> DeserializerCallback
Callback type for deserializing tasks from packet buffers.
The deserializer handles both boundary detection and task creation. It knows its own header format, so it can determine message boundaries.
- Param packet:
The packet buffer (may contain multiple messages or partial data).
- Param consumedBytes:
Output: bytes consumed from packet (0 if not enough data).
- Return:
The deserialized task, or nullptr if not enough data for complete message.
-
typedef void (*WorkloadAdmittedTracedCallback)(uint64_t workloadId, uint32_t taskCount)
TracedCallback signature for workload admitted events.
- Param workloadId:
The admitted workload ID.
- Param taskCount:
Number of tasks in the workload.
-
typedef void (*WorkloadRejectedTracedCallback)(uint32_t taskCount, const std::string &reason)
TracedCallback signature for workload rejected events.
- Param taskCount:
Number of tasks in the rejected workload.
- Param reason:
Human-readable rejection reason.
-
typedef void (*TaskDispatchedTracedCallback)(uint64_t workloadId, uint64_t taskId, uint32_t backendIdx)
TracedCallback signature for task dispatched events.
- Param workloadId:
The workload this task belongs to.
- Param taskId:
The dispatched task ID.
- Param backendIdx:
The backend index the task was dispatched to.
-
typedef void (*TaskCompletedTracedCallback)(uint64_t workloadId, uint64_t taskId, uint32_t backendIdx)
TracedCallback signature for task completed events.
- Param workloadId:
The workload this task belongs to.
- Param taskId:
The completed task ID.
- Param backendIdx:
The backend index that processed the task.
-
typedef void (*WorkloadCancelledTracedCallback)(uint64_t workloadId)
TracedCallback signature for workload cancelled events.
- Param workloadId:
The cancelled workload ID.
-
typedef void (*WorkloadCompletedTracedCallback)(uint64_t workloadId)
TracedCallback signature for workload completed events.
- Param workloadId:
The completed workload ID.
Public Functions
-
void SetCluster(const Cluster &cluster)
Set the cluster of backend workers.
- Parameters:
cluster – The cluster configuration.
-
void RegisterTaskType(uint8_t taskType, DeserializerCallback fullDeserializer, DeserializerCallback metadataDeserializer)
Register deserializers for a task type.
Associates a 1-byte task type identifier with full and metadata-only deserializer callbacks. Must be called before StartApplication() for custom task types. If no types are registered, SimpleTask is registered automatically at startup.
- Parameters:
taskType – The task type identifier (from Task::GetTaskType()).
fullDeserializer – Callback to deserialize a complete task (header + payload).
metadataDeserializer – Callback to deserialize task metadata only (header).
-
uint64_t GetWorkloadsAdmitted() const
Get number of workloads admitted.
- Returns:
Count of admitted workloads.
-
uint64_t GetWorkloadsRejected() const
Get number of workloads rejected.
- Returns:
Count of rejected workloads.
-
uint64_t GetWorkloadsCompleted() const
Get number of workloads completed.
- Returns:
Count of completed workloads.
-
uint32_t GetActiveWorkloadCount() const
Get number of active workloads.
- Returns:
Count of currently executing workloads.
-
uint64_t GetWorkloadsCancelled() const
Get number of workloads cancelled.
- Returns:
Count of cancelled workloads (e.g., due to client disconnect).
-
Ptr<ClusterScheduler> GetScheduler() const
Get the configured scheduler.
- Returns:
The scheduler, or nullptr if not set.
-
Ptr<AdmissionPolicy> GetAdmissionPolicy() const
Get the configured admission policy.
- Returns:
The admission policy, or nullptr if not set (always admit).
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
struct TaskTypeEntry
Entry in the task type registry.
Public Members
-
DeserializerCallback fullDeserializer
Full task deserializer (header + payload)
-
DeserializerCallback metadataDeserializer
Header-only deserializer.
-
DeserializerCallback fullDeserializer
Cluster
-
class Cluster
Represents a cluster of backend server nodes for distributed computing.
A Cluster holds references to server nodes and their addresses. It provides iteration and access patterns similar to NodeContainer. The Cluster is used by ClusterScheduler implementations to select which backend should handle incoming tasks.
Each backend in the cluster is represented by a Backend struct containing:
A pointer to the server Node (which may have a GpuAccelerator aggregated)
The network Address (InetSocketAddress with IP and port) for TCP connections
Example usage:
Cluster cluster; cluster.AddBackend(serverNode1, InetSocketAddress(addr1, 9000)); cluster.AddBackend(serverNode2, InetSocketAddress(addr2, 9000)); cluster.AddBackend(serverNode3, InetSocketAddress(addr3, 9000)); for (auto it = cluster.Begin(); it != cluster.End(); ++it) { Ptr<Node> node = it->node; Address addr = it->address; }
Public Types
Public Functions
-
Cluster()
Create an empty cluster.
-
void AddBackend(Ptr<Node> node, const Address &address, const std::string &acceleratorType = "")
Add a backend to the cluster.
- Parameters:
node – The backend node. This node should have an application installed and a GpuAccelerator aggregated.
address – The address clients should connect to (typically InetSocketAddress with the server’s IP and port).
acceleratorType – The type of accelerator on this backend (e.g., “GPU”, “TPU”). Empty string means any/unspecified.
-
uint32_t GetN() const
Get the number of backends in the cluster.
- Returns:
The number of backend servers.
-
Iterator Begin() const
Get an iterator to the first backend.
- Returns:
Iterator pointing to the first backend, or End() if empty.
-
Iterator End() const
Get an iterator past the last backend.
- Returns:
Iterator pointing past the last backend.
-
Iterator begin() const
Get an iterator to the first backend (lowercase for range-based for).
- Returns:
Iterator pointing to the first backend, or end() if empty.
-
Iterator end() const
Get an iterator past the last backend (lowercase for range-based for).
- Returns:
Iterator pointing past the last backend.
-
bool IsEmpty() const
Check if the cluster is empty.
- Returns:
true if the cluster has no backends, false otherwise.
-
void Clear()
Remove all backends from the cluster.
-
const std::vector<uint32_t> &GetBackendsByType(const std::string &acceleratorType) const
Get backend indices for a specific accelerator type.
Returns a vector of indices into the cluster for backends matching the specified accelerator type. This enables efficient scheduling decisions without iterating through all backends.
- Parameters:
acceleratorType – The accelerator type to filter by (e.g., “GPU”, “TPU”).
- Returns:
Vector of backend indices matching the type. Empty if none match.
-
bool HasAcceleratorType(const std::string &acceleratorType) const
Check if the cluster has backends of a specific accelerator type.
- Parameters:
acceleratorType – The accelerator type to check for.
- Returns:
true if at least one backend has this accelerator type.
-
struct Backend
Information about a backend server in the cluster.
Public Members
-
Ptr<Node> node
The backend server node (may have GpuAccelerator aggregated)
-
Address address
Server address (InetSocketAddress with IP and port)
-
std::string acceleratorType
Type of accelerator (e.g., “GPU”, “TPU”). Empty = any.
-
Ptr<Node> node
ClusterState
-
class ClusterState
Centralized view of per-backend load and device metrics for decision-makers.
ClusterState is a plain data container owned by EdgeOrchestrator. It aggregates orchestrator-tracked dispatch/completion counts and device-reported metrics into a single object that is passed to ScalingPolicy, ClusterScheduler, and AdmissionPolicy on each call.
Public Functions
-
void Resize(uint32_t n)
Resize the backend state vector.
- Parameters:
n – Number of backends.
-
uint32_t GetN() const
Get the number of backends.
- Returns:
Number of backends.
-
const BackendState &Get(uint32_t idx) const
Get the state of a specific backend.
- Parameters:
idx – Backend index.
- Returns:
Const reference to the backend state.
-
void NotifyTaskDispatched(uint32_t backendIdx)
Record that a task was dispatched to a backend.
- Parameters:
backendIdx – The backend index.
-
void NotifyTaskCompleted(uint32_t backendIdx)
Record that a task completed on a backend.
- Parameters:
backendIdx – The backend index.
-
void SetDeviceMetrics(uint32_t backendIdx, Ptr<DeviceMetrics> metrics)
Store device metrics for a backend.
- Parameters:
backendIdx – The backend index.
metrics – The device metrics.
-
void SetCommandedFrequency(uint32_t backendIdx, double frequency)
Set the commanded frequency for a backend.
- Parameters:
backendIdx – The backend index.
frequency – The commanded frequency in Hz.
-
void SetActiveWorkloadCount(uint32_t count)
Set the active workload count.
- Parameters:
count – Number of active workloads.
-
uint32_t GetActiveWorkloadCount() const
Get the active workload count.
- Returns:
Number of active workloads.
-
void Clear()
Clear all state.
-
struct BackendState
Per-backend state combining orchestrator-tracked load and device metrics.
Public Members
-
uint32_t activeTasks = {0}
Dispatched but not yet completed.
-
uint32_t totalDispatched = {0}
Lifetime dispatch count.
-
uint32_t totalCompleted = {0}
Lifetime completion count.
-
double commandedFrequency = {0.0}
Last frequency commanded by DeviceManager.
-
Ptr<DeviceMetrics> deviceMetrics
Latest device-reported metrics (nullable)
-
uint32_t activeTasks = {0}
-
void Resize(uint32_t n)
ClusterScheduler
-
class ClusterScheduler : public Object
Abstract base class for task scheduling policies.
ClusterScheduler determines which backend in a cluster should execute a given task.
ClusterScheduler is used by EdgeOrchestrator for task placement decisions during DAG execution.
Example usage:
Ptr<ClusterScheduler> scheduler = CreateObject<FirstFitScheduler>(); Ptr<Task> task = CreateObject<SimpleTask>(); task->SetRequiredAcceleratorType("GPU"); int32_t backendIdx = scheduler->ScheduleTask(task, cluster); if (backendIdx >= 0) { // Dispatch task to cluster.Get(backendIdx) }
Subclassed by ns3::FirstFitScheduler, ns3::LeastLoadedScheduler
Public Functions
-
virtual int32_t ScheduleTask(Ptr<Task> task, const Cluster &cluster, const ClusterState &state) = 0
Select a backend to execute the given task.
Examines the task’s requirements (accelerator type, compute demand, etc.) and cluster state to select an appropriate backend.
- Parameters:
task – The task to schedule.
cluster – The cluster of available backends.
state – Per-backend load and device metrics.
- Returns:
Index into cluster (0 to GetN()-1), or -1 if no suitable backend found.
-
virtual bool CanScheduleTask(Ptr<Task> task, const Cluster &cluster, const ClusterState &state) const
Check if a task can be scheduled without side effects.
Returns true if ScheduleTask() would find a suitable backend. Default implementation checks if any backend matches the task’s required accelerator type. Other scheduler’s can override with custom behaviour.
- Parameters:
task – The task to check.
cluster – The cluster of available backends.
state – Per-backend load and device metrics.
- Returns:
true if the task can be scheduled, false otherwise.
-
virtual void NotifyTaskCompleted(uint32_t backendIdx, Ptr<Task> task)
Notify scheduler that a task completed on a backend.
Called when a task finishes execution. Stateful schedulers can use this to update internal state (e.g., decrement pending count, track latency). Default implementation does nothing.
- Parameters:
backendIdx – The backend index where the task completed.
task – The task that completed.
-
virtual std::string GetName() const = 0
Get the scheduler name for logging and debugging.
- Returns:
A string identifying this scheduler type.
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
virtual int32_t ScheduleTask(Ptr<Task> task, const Cluster &cluster, const ClusterState &state) = 0
FirstFitScheduler
-
class FirstFitScheduler : public ns3::ClusterScheduler
Simple scheduler that selects the first suitable backend.
FirstFitScheduler iterates through backends in round-robin order and selects the first one that matches the task’s accelerator requirements.
Public Functions
-
virtual int32_t ScheduleTask(Ptr<Task> task, const Cluster &cluster, const ClusterState &state) override
Select a backend for the task using first-fit with round-robin.
Starts from the last selected index and finds the first backend that matches the task’s accelerator requirement. Returns -1 if no match found.
- Parameters:
task – The task to schedule.
cluster – The cluster of backends.
state – The current state of the cluster.
- Returns:
Backend index, or -1 if no suitable backend.
-
virtual std::string GetName() const override
Get the scheduler name.
- Returns:
“FirstFit”
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
virtual int32_t ScheduleTask(Ptr<Task> task, const Cluster &cluster, const ClusterState &state) override
LeastLoadedScheduler
-
class LeastLoadedScheduler : public ns3::ClusterScheduler
Scheduler that selects the backend with the fewest active tasks.
LeastLoadedScheduler picks the backend with the minimum number of active tasks (dispatched but not yet completed), as tracked in ClusterState. If the task specifies a required accelerator type, only matching backends are considered. Ties are broken by uniform random selection.
Public Functions
-
virtual int32_t ScheduleTask(Ptr<Task> task, const Cluster &cluster, const ClusterState &state) override
Select the least-loaded backend for the task.
- Parameters:
task – The task to schedule.
cluster – The cluster of backends.
state – Per-backend load state.
- Returns:
Backend index, or -1 if no suitable backend.
-
virtual std::string GetName() const override
Get the scheduler name.
- Returns:
“LeastLoaded”
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
virtual int32_t ScheduleTask(Ptr<Task> task, const Cluster &cluster, const ClusterState &state) override
AdmissionPolicy
-
class AdmissionPolicy : public Object
Abstract base class for admission control policies.
AdmissionPolicy determines whether a workload should be accepted for execution.
Implementations are stateless - the orchestrator tracks active workloads and passes the count to ShouldAdmit(). This follows real-world patterns from Kubernetes and Spark where admission controllers are stateless.
Example usage:
Ptr<AdmissionPolicy> policy = CreateObject<AlwaysAdmitPolicy>(); if (policy->ShouldAdmit(dag, cluster, activeCount)) { // Accept workload... }
Subclassed by ns3::AlwaysAdmitPolicy, ns3::DeadlineAwareAdmissionPolicy, ns3::MaxActiveTasksPolicy
Public Functions
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) = 0
Check if a workload should be admitted.
This method evaluates the workload against the current cluster state and policy rules to decide admission.
- Parameters:
dag – The workload DAG (single tasks are wrapped as 1-node DAGs)
cluster – Current cluster state with available backends
state – Per-backend load and device metrics
- Returns:
true if the workload should be admitted, false if rejected
-
virtual std::string GetName() const = 0
Get the policy name for logging and debugging.
- Returns:
A string identifying this policy type.
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) = 0
AlwaysAdmitPolicy
-
class AlwaysAdmitPolicy : public ns3::AdmissionPolicy
Admission policy that always admits workloads.
AlwaysAdmitPolicy is a baseline policy that accepts all workloads regardless of cluster state or workload metrics.
Public Functions
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) override
Always admits the workload.
- Parameters:
dag – The workload DAG (ignored)
cluster – Current cluster state (ignored)
state – Per-backend load and device metrics (ignored)
- Returns:
Always returns true
-
virtual std::string GetName() const override
Get the policy name.
- Returns:
“AlwaysAdmit”
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) override
MaxActiveTasksPolicy
-
class MaxActiveTasksPolicy : public ns3::AdmissionPolicy
Type-aware admission policy that rejects workloads when compatible backends are at capacity.
MaxActiveTasksPolicy checks whether backends compatible with the workload’s required accelerator types have fewer active tasks than the configured threshold. For each required type, at least one matching backend must have capacity. Tasks with no required type are matched against any backend.
Public Functions
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) override
Admit if compatible backends have capacity for each required task type.
- Parameters:
dag – The workload DAG (inspected for required accelerator types).
cluster – The cluster (used for type-based backend lookup).
state – Per-backend load and device metrics.
- Returns:
true if capacity exists for all required types, false otherwise.
-
virtual std::string GetName() const override
Get the policy name.
- Returns:
“MaxActiveTasks”
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) override
DeadlineAwareAdmissionPolicy
-
class DeadlineAwareAdmissionPolicy : public ns3::AdmissionPolicy
Admission policy that rejects workloads with infeasible deadlines.
Estimates whether each task in the DAG can complete before its deadline given current backend load and DAG dependency structure. A task cannot start until all its predecessors complete, so the earliest start time accounts for the critical path through the DAG.
Tasks without deadlines are always feasible. If any task cannot meet its deadline on any matching backend, the entire workload is rejected.
Public Functions
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) override
Admit if all tasks with deadlines can be met on at least one backend.
- Parameters:
dag – The workload DAG
cluster – Current cluster state
state – Per-backend load and device metrics
- Returns:
true if all deadline-bearing tasks are feasible
-
virtual std::string GetName() const override
Get the policy name.
- Returns:
“DeadlineAware”
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
-
virtual bool ShouldAdmit(Ptr<DagTask> dag, const Cluster &cluster, const ClusterState &state) override