Adaptive Environment#

Adaptive Environment Module#

This module defines the AdaptiveEnv class, which represents an adaptive environment for agents interacting with a resource distribution via influence kernels. The class models the competition of agents via their influence over the environment and computes gradients for optimization. It provides methods to compute influence, rewards, and the gradient of each agent's reward.

The module supports different types of influence kernels, including:

  • Gaussian

  • Jones

  • Dirichlet

  • Multi-variate Gaussian

  • Custom influence kernels (user-defined)

The AdaptiveEnv class supports gradient ascent methods for optimizing agent positions in the environment.

Dependencies:#

  • InflGame.utils

  • InflGame.kernels

  • InflGame.domains

Usage:#

The AdaptiveEnv class can be used to simulate and optimize agent interactions in an environment with resource distributions. It supports various influence kernel types and gradient ascent methods for optimization.

Example:#

from InflGame.adaptive.grad_func_env import AdaptiveEnv
import torch
import numpy as np

# Initialize the environment
env = AdaptiveEnv(
    num_agents=3,
    agents_pos=np.array([0.2, 0.5, 0.8]),
    parameters=torch.tensor([1.0, 1.0, 1.0]),
    resource_distribution=torch.tensor([10.0, 20.0, 30.0]),
    bin_points=np.array([0.1, 0.4, 0.7]),
    infl_configs={'infl_type': 'gaussian'},
    learning_rate_type='cosine',
    learning_rate=[0.0001, 0.01, 15],
    time_steps=100,
    domain_type='1d',
    domain_bounds=[0, 1]
)

# Perform gradient ascent
env.gradient_ascent(show_out=True)
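
The environment's intermediate quantities can also be inspected directly, either after or instead of running gradient ascent. A minimal sketch using the env constructed above (the parameter values are simply reused from the example; shapes follow the method docs below):

# N x K probability matrix G (agents x bin points)
G = env.prob_matrix()

# Expected reward and reward gradient per agent for the same kernel parameters
rewards = env.reward_F(torch.tensor([1.0, 1.0, 1.0]))
grads = env.gradient(torch.tensor([1.0, 1.0, 1.0]))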

Classes

class InflGame.adaptive.grad_func_env.AdaptiveEnv(num_agents, agents_pos, parameters, resource_distribution, bin_points, infl_configs={'infl_type': 'gaussian'}, learning_rate_type='cosine', learning_rate=[0.0001, 0.01, 15], time_steps=100, fp=0, infl_cshift=False, cshift=0, infl_fshift=False, Q=0, domain_type='1d', domain_bounds=[0, 1], tolerance=1e-05, tolerated_agents=None, ignore_zero_infl=False)#

The AdaptiveEnv class represents an adaptive environment for agents interacting with a resource distribution via influence kernels. This class models the competition of agents via their influence over the environment and computes gradients for optimization. The class provides methods to compute influence, reward, and gradients based on the influence of agents on the resource distribution. It also supports different types of influence kernels, including Gaussian, Jones, Dirichlet, and custom influence kernels.

Methods

d_lnf_matrix([parameter_instance])

Computes the derivative of the log of the influence function matrix.

d_torch(parameter_instance)

Compute the gradient of the custom influence matrix using PyTorch autograd.

gradient(parameter_instance)

Compute the gradient of the reward function \(u_i(x)\) with respect to agent positions x_i.

gradient_ascent([show_out, grad_modify, reward])

This is the helper function for performing gradient ascent for agents in the environment.

gradient_function(agents_pos, parameter_instance)

The gradient function computes the gradient of the reward function for a given set of agent positions and parameters.

influence_matrix([parameter_instance])

Compute the influence matrix for all agents using vectorized operations.

mv_gradient_ascent([show_out, grad_modify, ...])

Perform multi-variable gradient ascent for agents in the environment using the gradient calculated by the function gradient.

prob_matrix([parameter_instance])

Computes the probability matrix for agents influencing a resource based on their influence kernel \(f_{i}(x_i,b_k)\) computed by influence.

reward_F(parameter_instance)

Compute the expected reward for each agent given a reward distribution and all agents' influence kernels.

shift_matrix(parameter_instance)

Compute the shift matrix for functional shifts in influence kernels.

sv_gradient_ascent([show_out, grad_modify, ...])

The gradient ascent is performed by updating the agent positions based on the gradient and a learning rate.

d_lnf_matrix(parameter_instance=None)#

Computes the derivative of the log of the influence function matrix, i.e.

\[\frac{\partial}{\partial x_{(i,l)}}ln(f_{i}(x_i,b))=\frac{1}{f_{i}(x_{i},b)}\frac{\partial}{\partial x_{(i,l)}}f_{i}(x_{i},b)\]

The derivative matrix is a \(N \times K\) matrix where \(N\) is the number of agents and \(K\) is the number of bin/resource points. The entry \(\frac{\partial}{\partial x_i}ln(f_{i}(x_i,b_k))\) is the derivative of the log of the influence of the \(i\) th agent on the \(k\) th bin/resource point, i.e.

\[\begin{split}\mathbf{D}=\begin{bmatrix} \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_1)) & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_2)) & \cdots & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_K)) \\ \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_1)) & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_2)) & \cdots & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_K)) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_1)) & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_2)) & \cdots & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_K)) \end{bmatrix}\end{split}\]
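
For example, if the Gaussian influence kernel takes the standard form \(f_i(x_i,b)=\exp\left(-\frac{(b-x_i)^2}{2\sigma_i^2}\right)\) (an assumption here; the exact parameterization is defined in InflGame.kernels), then the corresponding analytic entry is

\[\frac{\partial}{\partial x_i}ln(f_{i}(x_i,b_k))=\frac{b_k-x_i}{\sigma_i^2}\]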

This function is only used for the prebuilt influence kernels from the paper (see influence), where the derivatives are computed analytically:

  • Gaussian influence kernel

    (infl_type==’gaussian’)

  • Jones influence kernel

    (infl_type==’Jones_M’)

  • Dirichlet influence kernel

    (infl_type==’dirichlet’)

  • Multi-variate Gaussian influence kernel

    (infl_type==’multi_gaussian’)

  • Beta influence kernel

    (infl_type==’beta’)

For custom influence kernels, use d_torch. This is done automatically by the AdaptiveEnv class when infl_type=='custom_influence'.

Parameters:

parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

Returns:

Derivative matrix.

Return type:

Union[int, torch.Tensor]

d_torch(parameter_instance)#

Compute the gradient of the custom influence matrix using PyTorch autograd, i.e.

\[\frac{\partial}{\partial x_{(i,l)}}ln(f_{i}(x_i,b))=\frac{1}{f_{i}(x_{i},b)}\frac{\partial}{\partial x_{(i,l)}}f_{i}(x_{i},b)\]

when using the infl_type=’custom’ influence kernel. This is done using PyTorch’s autograd functionality, so the number of bin points must be large enough to compute the gradient ( \(K\sim 100\) ).

If you are using a non-custom influence kernel, use d_lnf_matrix instead; this is done automatically by the AdaptiveEnv class.

The derivative matrix is a \(N \times K\) matrix where \(N\) is the number of agents and \(K\) is the number of bin/resource points. The entry \(\frac{\partial}{\partial x_i}ln(f_{i}(x_i,b_k))\) is the gradient of the log of the influence of the \(i\) th agent on the \(k\) th bin/resource point, i.e.

\[\begin{split}\mathbf{D}=\begin{bmatrix} \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_1)) & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_2)) & \cdots & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_K)) \\ \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_1)) & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_2)) & \cdots & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_K)) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_1)) & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_2)) & \cdots & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_K)) \end{bmatrix}\end{split}\]
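
A minimal sketch of the autograd idea behind d_torch, using a hypothetical custom Gaussian-like kernel (the kernel and its signature are illustrative assumptions, not the interface the class expects for custom influence functions):

import torch

# Hypothetical custom kernel: f(x, b) = exp(-(b - x)^2 / (2 * sigma^2))
def custom_influence(pos, bin_points, sigma=0.1):
    return torch.exp(-(bin_points - pos) ** 2 / (2 * sigma ** 2))

K = 100                                               # K ~ 100 bin points
bin_points = torch.linspace(0.0, 1.0, K)
# Repeat the agent position once per bin so autograd yields a per-bin derivative.
pos = torch.full((K,), 0.4, requires_grad=True)

log_f = torch.log(custom_influence(pos, bin_points))  # ln f_i(x_i, b_k), shape (K,)
log_f.sum().backward()
d_row = pos.grad                                      # one row of D: d/dx_i ln f_i(x_i, b_k)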
Parameters:

parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

Returns:

derivative matrix.

Return type:

torch.Tensor

Raises:
  • ValueError – If input parameters are invalid or incompatible.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

  • NotImplementedError – If custom influence function is not properly configured.

gradient(parameter_instance)#

Compute the gradient of the reward function \(u_i(x)\) with respect to agent positions x_i. The gradient is computed as the element-wise product of the derivative of the log of the influence function matrix and the probability matrix, dotted with the resource vector \(\mathbf{B}\), i.e.

\[\begin{split}\frac{\partial}{\partial x_{(i,l)}}u_i(x)=\sum_{k=1}^{K}G_{i,k}(x_i,b_k)\frac{\partial}{\partial x_{(i,l)}}ln(f_{i}(x_i,b_k))\\ =\left(\mathbf{G}\odot\mathbf{D}\right) \cdot \vec{B}\\\end{split}\]
\[\begin{split}\nabla\vec{R}=\left(\begin{bmatrix} G_{1,1} & G_{1,2} & \cdots & G_{1,K} \\ G_{2,1} & G_{2,2} & \cdots & G_{2,K} \\ \vdots & \vdots & \ddots & \vdots \\ G_{N,1} & G_{N,2} & \cdots & G_{N,K} \end{bmatrix} \odot \begin{bmatrix} \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_1)) & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_2)) & \cdots & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_K)) \\ \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_1)) & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_2)) & \cdots & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_K)) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_1)) & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_2)) & \cdots & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_K)) \end{bmatrix} \right)\cdot \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_K \end{bmatrix}\end{split}\]

The matrix \(\mathbf{D}\) is the derivative of the log of the influence function matrix computed by d_lnf_matrix or d_torch . The probability matrix \(\mathbf{G}\) is computed by the function prob_matrix.

The output \(\nabla\vec{R}\) is a \(N \times L\) matrix where \(N\) is the number of agents and \(L\) is the number of dimensions. The entry \(\nabla\vec{R}_{i,l}\) is the gradient of the reward of the \(i\) th agent along the \(l\) th dimension.
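
In the one-dimensional case the formula reduces to an element-wise product followed by a matrix-vector product. A minimal sketch with stand-in matrices (not the class's internal code):

import torch

N, K = 3, 5                          # agents, bin points
G = torch.rand(N, K)                 # stand-in probability matrix (see prob_matrix)
G = G / G.sum(dim=0, keepdim=True)   # columns sum to 1 across agents
D = torch.randn(N, K)                # stand-in for d_lnf_matrix / d_torch output
B = torch.rand(K)                    # resource distribution over bin points

grad = (G * D) @ B                   # shape (N,): gradient of each agent's reward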

Parameters:

parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

Returns:

Gradient values.

Return type:

torch.Tensor

Raises:
  • ValueError – If input parameters are invalid or incompatible.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

gradient_ascent(show_out=False, grad_modify=False, reward=True)#

This is the helper function for performing gradient ascent for agents in the environment. It dispatches to sv_gradient_ascent or mv_gradient_ascent depending on the domain type.

Parameters:
  • show_out (bool) – Whether to return intermediate outputs.

  • grad_modify (bool) – Whether to modify gradients.

  • reward (bool) – Whether to compute rewards.

Returns:

Gradient ascent results.

Return type:

Optional[Union[torch.Tensor, Tuple[torch.Tensor, torch.Tensor, torch.Tensor]]]

Raises:
  • ValueError – If input parameters are invalid or incompatible.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

  • AttributeError – If required attributes are missing from the environment.

gradient_function(agents_pos, parameter_instance, ids=[0, 1], two_a=True)#

The gradient function computes the gradient of the reward function for a specific set of agents, given a position vector and parameters for the influence kernels. The gradient is computed as the element-wise product of the derivative of the log of the influence function matrix and the probability matrix, dotted with the resource vector \(\mathbf{B}\).

i.e.

\[\begin{split}\frac{\partial}{\partial x_{(i,l)}}u_i(x)=\sum_{b\in \mathbb{B}}^{K}B(b)G_{i}(x_i,b)\\ =\sum_{k=1}^{K}G_{i,k}(x_i,b_k)\frac{\partial}{\partial x_{(i,l)}}ln(f_{i}(x_i,b_k))\\ =\left(\mathbf{G}\odot\mathbf{D}\right) \cdot \vec{B}\\\end{split}\]
\[\begin{split}\nabla\vec{R}=\left(\begin{bmatrix} G_{1,1} & G_{1,2} & \cdots & G_{1,K} \\ G_{2,1} & G_{2,2} & \cdots & G_{2,K} \\ \vdots & \vdots & \ddots & \vdots \\ G_{N,1} & G_{N,2} & \cdots & G_{N,K} \end{bmatrix} \odot \begin{bmatrix} \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_1)) & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_2)) & \cdots & \frac{\partial}{\partial x_1}ln(f_{1}(x_1,b_K)) \\ \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_1)) & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_2)) & \cdots & \frac{\partial}{\partial x_2}ln(f_{2}(x_2,b_K)) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_1)) & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_2)) & \cdots & \frac{\partial}{\partial x_N}ln(f_{N}(x_N,b_K)) \end{bmatrix} \right)\cdot \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_K \end{bmatrix}\end{split}\]

The matrix \(\mathbf{D}\) is the derivative of the log of the influence function matrix computed by d_lnf_matrix or d_torch . The probability matrix \(\mathbf{G}\) is computed by the function prob_matrix.

The output \(\nabla\vec{R}\) is a \(N \times L\) matrix where \(N\) is the number of agents and \(L\) is the number of dimensions. The entry \(\nabla\vec{R}_{i,l}\) is the gradient of the reward of the \(i\) th agent along the \(l\) th dimension.

Parameters:
  • agents_pos (Union[List[float], np.ndarray]) – Positions of the agents.

  • parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

  • ids (List[int]) – IDs of the agents to compute gradients for.

  • two_a (bool) – Whether to compute gradients for all agents.

Returns:

Gradient values.

Return type:

torch.Tensor

Raises:
  • ValueError – If input parameters are invalid or incompatible.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

influence_matrix(parameter_instance=None)#

Compute the influence matrix for all agents using vectorized operations.

This function computes the influence values for all agents across all bin points, with optional constant and functional shifts. The function supports multiple influence kernel types and provides comprehensive error handling.
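
A minimal vectorized sketch of what such an influence matrix looks like for a Gaussian kernel (the parameterization and parameter meaning are assumptions; the class supports several kernel types plus optional shift rows):

import torch

agents_pos = torch.tensor([0.2, 0.5, 0.8])   # N agent positions
bin_points = torch.tensor([0.1, 0.4, 0.7])   # K bin points
sigma = torch.tensor([1.0, 1.0, 1.0])        # per-agent kernel parameters (assumed)

# Broadcast to an N x K matrix: entry (i, k) is f_i(x_i, b_k).
diff = bin_points.unsqueeze(0) - agents_pos.unsqueeze(1)
F = torch.exp(-diff ** 2 / (2 * sigma.unsqueeze(1) ** 2))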

Parameters:

parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

Returns:

Influence matrix of shape (N, K) or (N+shifts, K), where N is the number of agents, K is the number of bin points, and shifts are additional rows for constant/functional shifts.

Return type:

torch.Tensor

Raises:
  • ValueError – If input parameters are invalid or incompatible.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

  • NotImplementedError – If functional shift is requested for multi-dimensional agents.

mv_gradient_ascent(show_out=False, grad_modify=False, reward=True)#

Perform multi-variable gradient ascent for agents in the environment using the gradient calculated by the function gradient.

The gradient ascent is performed by updating the agent positions based on the gradient and a learning rate. The learning rate is scheduled using the function InflGame.utils.general.learning_rate. The algorithm runs for a fixed number of time steps or until the agents converge to a solution within a specified tolerance for the absolute difference between the current and previous agent positions. The gradient ascent is performed in the following steps:

  1. Compute the gradient of the reward function using the function gradient.

  2. Normalize the gradient if the domain type is ‘simplex’.

    • For simplex, the gradient is normalized to ensure that the agent positions remain within the simplex.

  3. Update the agent positions using the gradient and the learning rate.

    • The learning rate is computed using the function InflGame.utils.general.learning_rate.

    • The agent positions are updated by adding the gradient multiplied by the learning rate to the current agent positions.

    • The updated agent positions are projected onto the simplex if the domain type is ‘simplex’.

  4. Store the agent positions, gradients, and rewards at each time step.

  5. Check for convergence by computing the absolute difference between the current and previous agent positions.

  6. If the absolute difference is less than the specified tolerance (tolerance) for a set number of agents (tolerated_agents), break the loop.

i.e. a time step looks like this:

\[\begin{split}\mathbf{x}_{i;t+1}=\mathbf{x}_{i;t}+\eta_t\cdot\nabla\vec{R}_{i;t}\\\end{split}\]

with the stop condition:

\[\begin{split}\sum_{i=1}^{N}||\mathbf{x}_{i;t+1}-\mathbf{x}_{i;t}||_1\leq \epsilon = E\\\end{split}\]

where \(\epsilon\) is the tolerance and \(E\) is the tolerated agents. The learning rate \(\eta_t\) is computed using the function InflGame.utils.general.learning_rate.

If the domain type is ‘simplex’, the agent positions are projected onto the simplex so the update step looks like this:

\[\begin{split}\mathbf{x}_{i;t+1}=\mathbf{P}_{\Delta}(\mathbf{x}_{i;t}+\eta_t\cdot normalized(\nabla \vec{R}_{i;t}))\\\end{split}\]

using the function InflGame.domains.simplex.simplex_utils.projection_onto_simplex.

Due to the normalization of the gradient, the stopping condition is slightly different:

\[\begin{split}\sum_{i=1}^{N}||\mathbf{x}_{i;t+5}-\mathbf{x}_{i;t}||_1\leq \epsilon = E\\\end{split}\]
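
A generic sketch of the simplex-projected update step described above (the Euclidean projection shown here is the standard algorithm and may differ in detail from InflGame.domains.simplex.simplex_utils.projection_onto_simplex; the gradient and learning rate are stand-ins):

import torch

def project_onto_simplex(v):
    # Standard Euclidean projection of a vector onto the probability simplex.
    u, _ = torch.sort(v, descending=True)
    css = torch.cumsum(u, dim=0)
    ks = torch.arange(1, len(v) + 1, dtype=v.dtype)
    rho = torch.nonzero(u + (1 - css) / ks > 0).max()
    theta = (1 - css[rho]) / (rho + 1)
    return torch.clamp(v + theta, min=0.0)

x = torch.tensor([0.2, 0.3, 0.5])          # one agent's position on the simplex
grad = torch.tensor([0.4, -0.1, -0.3])     # stand-in gradient for that agent
eta = 0.05                                 # stand-in learning rate eta_t
step = grad / torch.norm(grad)             # normalized gradient (simplex domain)
x_new = project_onto_simplex(x + eta * step)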
Parameters:
  • show_out (bool) – Whether to return intermediate outputs.

  • grad_modify (bool) – Whether to modify gradients.

  • reward (bool) – Whether to compute rewards.

Returns:

Gradient ascent results.

Return type:

torch.Tensor

Raises:
  • ValueError – If input parameters are invalid or incompatible.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

prob_matrix(parameter_instance=None)#

Computes the probability matrix for agents influencing a resource based on their influence kernel \(f_{i}(x_i,b_k)\) computed by influence, where the probability of agent \(i\) influencing a bin/resource point is defined as

\[G_{i,k}(\mathbf{x},b_k)=\frac{f_{i}(x_i,b_k)}{\sum_{j=1}^{N}f_{j}(x_j,b_k)}.\]

The probability matrix is a \(N \times K\) matrix where \(N\) is the number of agents and \(K\) is the number of bin/resource points. The entry \(G_{i,k}\) is the probability of the \(i\) th agent influencing the \(k\) th bin/resource point, i.e.

\[\begin{split}\begin{bmatrix} G_{1,1} & G_{1,2} & \cdots & G_{1,K} \\ G_{2,1} & G_{2,2} & \cdots & G_{2,K} \\ \vdots & \vdots & \ddots & \vdots \\ G_{N,1} & G_{N,2} & \cdots & G_{N,K} \end{bmatrix}\end{split}\]
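
The normalization amounts to dividing each column of the influence matrix by its column sum. A minimal sketch with a stand-in influence matrix:

import torch

F = torch.rand(3, 5)                   # stand-in N x K influence matrix
G = F / F.sum(dim=0, keepdim=True)     # G[i, k] = f_i(x_i, b_k) / sum_j f_j(x_j, b_k)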
Parameters:

parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

Returns:

Probability matrix.

Return type:

torch.Tensor

Raises:
  • ValueError – If input dimensions are incompatible or contain invalid values.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

reward_F(parameter_instance)#

Compute the expected reward for each agent given a reward distribution and all agents' influence kernels. The probability of an agent influencing a bin point is its relative influence at that point.

The reward is then computed as the matrix product of the probability matrix and the resource distribution, i.e.

\[\begin{split}u_i&=\sum_{b\in \mathbb{B}}^{K}B(b) G_{i}(x_i,b))\\ &=\sum_{k=1}^{K}B_k G_{i,k}(x_i,b_k)\\ &=\begin{bmatrix} G_{1,1} & G_{1,2} & \cdots & G_{1,K} \\ G_{2,1} & G_{2,2} & \cdots & G_{2,K} \\ \vdots & \vdots & \ddots & \vdots \\ G_{N,1} & G_{N,2} & \cdots & G_{N,K} \end{bmatrix} \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_K \end{bmatrix}\end{split}\]

where \(\mathbb{B}=\{b_1,b_2,\cdots,b_K\}\) is the set of bin points and \(B_k=B(b_k)\) is the resource at bin point \(b_k\).

The probability matrix is computed by the function prob_matrix .
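
In code, the expected reward is just this matrix-vector product. A minimal sketch with stand-in values:

import torch

G = torch.rand(3, 5)
G = G / G.sum(dim=0, keepdim=True)                  # probability matrix (columns sum to 1)
B = torch.tensor([10.0, 20.0, 30.0, 15.0, 25.0])    # resource at each bin point
u = G @ B                                           # u[i] = expected reward of agent i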

Parameters:

parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

Returns:

Reward values for agents.

Return type:

torch.Tensor

Raises:
  • ValueError – If input dimensions are incompatible or contain invalid values.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.

shift_matrix(parameter_instance)#

Compute the shift matrix for functional shifts in influence kernels. This function is mostly used for abstaining voters and fixed party examples, but can also be used for demonstrating how types of non-symmetry can impact agent behavior. The shift matrix is a \(N \times K\) matrix where \(N\) is the number of agents and \(K\) is the number of bin/resource points. The entry \(s_{i,k}\) is the shift of the \(i\) th agent on the \(k\) th bin/resource point, i.e.

\[\begin{split}\begin{bmatrix} s_{1,1} & s_{1,2} & \cdots & s_{1,K} \\ s_{2,1} & s_{2,2} & \cdots & s_{2,K} \\ \vdots & \vdots & \ddots & \vdots \\ s_{N,1} & s_{N,2} & \cdots & s_{N,K} \end{bmatrix}\end{split}\]

There are different types of shifts that can be applied to the influence matrix:

  • Constant shift (infl_cshift=True)

    \[s_{i,k}=cshift\]
  • Functional shift (infl_fshift=True)

    An example of a functional shift is the abstaining voter model where the shift is defined as:

    \[\begin{split}s_{i,k}=-2Q\prod_{\substack{j=1\\ j\neq i}}^{N} (b_k-x_j)^2(b_k-x_i)\end{split}\]

    i.e. the influence of an abstaining voter on point \(b_k\) is

    \[s_{i}(x_i,b_k)=Q\prod_{j=1}^{N} (b_k-x_j)^2\]

    where \(x_i\) is the position of the \(i\) th agent and \(b_k\) is the \(k\) th bin point. \(Q\) is a scaling factor for the functional shift.
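
A direct (unvectorized) sketch of the functional shift formula above, with stand-in positions and scaling factor \(Q\):

import torch

Q = 1.0
agents_pos = torch.tensor([0.2, 0.5, 0.8])   # x_j
bin_points = torch.tensor([0.1, 0.4, 0.7])   # b_k
N, K = len(agents_pos), len(bin_points)

S = torch.zeros(N, K)
for i in range(N):
    for k in range(K):
        others = [j for j in range(N) if j != i]
        prod = torch.prod((bin_points[k] - agents_pos[others]) ** 2)
        S[i, k] = -2 * Q * prod * (bin_points[k] - agents_pos[i])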

Parameters:

parameter_instance (Union[List[float], np.ndarray, torch.Tensor]) – Parameters for the influence kernels.

Returns:

Shift matrix.

Return type:

torch.Tensor

Raises:
  • ValueError – If input dimensions are incompatible or invalid.

  • RuntimeError – If computation fails due to numerical issues.

sv_gradient_ascent(show_out=False, grad_modify=False, reward=True)#

The gradient ascent is performed by updating the agent positions based on the gradient and a learning rate. The learning rate is scheduled using the function InflGame.utils.general.learning_rate. The algorithm runs for a fixed number of time steps or until the agents converge to a solution within a specified tolerance for the absolute difference between the current and previous agent positions. The gradient ascent is performed in the following steps:

  1. Compute the gradient of the reward function using the function gradient.

  2. Update the agent positions using the gradient and the learning rate.

    • The learning rate is computed using the function InflGame.utils.general.learning_rate.

    • The agent positions are updated by adding the gradient multiplied by the learning rate to the current agent positions.

  3. Store the agent positions, gradients, and rewards at each time step.

  4. Check for convergence by computing the absolute difference between the current and previous agent positions.

  5. If the absolute difference is less than the specified tolerance (tolerance) for a set number of agents (tolerated_agents), break the loop.

i.e. a time step looks like this:

\[\begin{split}\mathbf{x}_{i;t+1}=\mathbf{x}_{i;t}+\eta_t\cdot\nabla\vec{R}_{i;t}\\\end{split}\]

with the stop condition:

\[\begin{split}\sum_{i=1}^{N}||\mathbf{x}_{i;t+1}-\mathbf{x}_{i;t}||_1\leq \epsilon = E\\\end{split}\]

where \(\epsilon\) is the tolerance and \(E\) is the tolerated agents. The learning rate \(\eta_t\) is computed using the function InflGame.utils.general.learning_rate.
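
A schematic of this single-variable update loop, using a hypothetical stand-in gradient function and a fixed learning rate in place of gradient and InflGame.utils.general.learning_rate (the per-agent convergence check follows the description above):

import torch

def grad_reward(x):
    # Hypothetical stand-in for the environment's gradient(); pulls agents toward 0.5.
    return 0.5 - x

x = torch.tensor([0.2, 0.5, 0.8])       # agent positions
eta = 0.01                              # stand-in for the scheduled learning rate eta_t
tolerance, tolerated_agents = 1e-5, 3

for t in range(100):                    # time_steps
    x_new = x + eta * grad_reward(x)    # x_{t+1} = x_t + eta_t * grad R_t
    converged = (torch.abs(x_new - x) < tolerance).sum() >= tolerated_agents
    x = x_new
    if converged:
        break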

Parameters:
  • show_out (bool) – Whether to return intermediate outputs.

  • grad_modify (bool) – Whether to modify gradients.

  • reward (bool) – Whether to compute rewards.

Returns:

Gradient ascent results.

Return type:

torch.Tensor

Raises:
  • ValueError – If input parameters are invalid or incompatible.

  • RuntimeError – If computation fails due to numerical issues.

  • TypeError – If input types are not supported.