Morphling requires users to specify the ProfilingExperiment interface for configuration tuning, including:
- ML model container (e.g., the Pod template),
- performance objective function,
- tunable configuration parameters with types and search ranges,
- sampling algorithms,
- sampling budget.
```go
type ProfilingExperimentSpec struct {
	ServicePodTemplate corev1.PodTemplate  `json:"servicePodTemplate,omitempty"`
	Objective          ObjectiveSpec       `json:"objective,omitempty"`
	TunableParameters  []ParameterCategory `json:"tunableParameters,omitempty"`
	Algorithm          AlgorithmSpec       `json:"algorithm,omitempty"`
	MaxNumTrials       *int32              `json:"maxNumTrials,omitempty"`
}
```

The ProfilingExperiment workflow looks as follows:
- A user submits a `ProfilingExperiment` via an RPC or front-end UI interface, specifying the ML model, tunable configuration parameters, optimization objectives, and sampling budget.
- Within the sampling budget, Morphling iteratively communicates with the algorithm server to get the next configuration for sampling.
- Morphling then starts a `Trial` to evaluate that sample.
- When performing a `Trial`, a model serving inference instance (`Deployment`) is launched, and its "readiness" is reported to trigger a client-side RPS stress-test `Job`.
- After the client `Job` completes, the measured peak RPS is stored in the DB.
- The `Trial` finishes, and the result is sent to the `ProfilingExperiment`.
- The `ProfilingExperiment` completes when the sampling budget is reached.
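The iterative loop above can be sketched in Go. This is a hedged illustration only: the function names are made up here, with stubs standing in for the algorithm server, the `Deployment`/`Job` machinery, and the DB.

```go
package main

import "fmt"

// nextConfig stands in for a call to the algorithm server that suggests
// the next configuration to sample (illustrative, not Morphling's API).
func nextConfig(trial int) map[string]string {
	return map[string]string{"cpu": fmt.Sprintf("%d", trial+1)}
}

// runTrial stands in for one Trial: launching the serving Deployment,
// waiting for readiness, running the client stress-test Job, and
// returning the measured peak RPS.
func runTrial(cfg map[string]string) float64 {
	return 100 * float64(len(cfg["cpu"])+1) // placeholder measurement
}

func main() {
	maxNumTrials := 3 // the sampling budget (MaxNumTrials in the spec)
	results := make(map[int]float64)
	for i := 0; i < maxNumTrials; i++ {
		cfg := nextConfig(i)       // ask the algorithm server for a sample
		results[i] = runTrial(cfg) // evaluate it in a Trial, store peak RPS
	}
	fmt.Printf("completed %d trials\n", len(results))
	// → completed 3 trials
}
```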
The sequence diagram of the `ProfilingExperiment` workflow is shown as follows:


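To make the spec's JSON shape concrete, here is a hedged Go sketch that populates a simplified stand-in of `ProfilingExperimentSpec` and prints it. The field types are deliberately flattened to strings (the real spec embeds `corev1.PodTemplate` and Morphling's `ObjectiveSpec`, `ParameterCategory`, and `AlgorithmSpec`), and the parameter names and algorithm are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the CRD types, only to show the JSON shape
// produced by the struct tags above; not the real Morphling types.
type ProfilingExperimentSpec struct {
	ServicePodTemplate string   `json:"servicePodTemplate,omitempty"`
	Objective          string   `json:"objective,omitempty"`
	TunableParameters  []string `json:"tunableParameters,omitempty"`
	Algorithm          string   `json:"algorithm,omitempty"`
	MaxNumTrials       *int32   `json:"maxNumTrials,omitempty"`
}

// exampleSpec builds a hypothetical experiment; all values are made up
// for illustration.
func exampleSpec() ProfilingExperimentSpec {
	budget := int32(18) // sampling budget
	return ProfilingExperimentSpec{
		ServicePodTemplate: "resnet50-serving-pod-template",
		Objective:          "maximize peak RPS",
		TunableParameters:  []string{"cpu", "memory", "batchSize"},
		Algorithm:          "grid",
		MaxNumTrials:       &budget,
	}
}

func main() {
	out, err := json.MarshalIndent(exampleSpec(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```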