Crate starpu_sys


§starpu-sys: Low-level bindings to StarPU

MPL licensed. Requires rustc 1.82.0+.

This crate contains unsafe Rust bindings to the C API of StarPU.

Using these bindings directly amounts to writing C in Rust syntax, which is neither idiomatic nor safe. The intent is to later build an idiomatic, safe Rust API on top of these bindings, in a separate starpu crate.

Installation instructions and a project overview can be found in the source repository’s top-level README.

Re-exports§

pub use cl_sys;
pub use hwlocality_sys;
pub use libc;

Structs§

__BindgenBitfieldUnit
__BindgenUnionField
__va_list_tag
_starpu_data_state
Data management facilities provided by StarPU. Existing data interfaces are described in \ref API_Data_Interfaces, but developers can design their own data interfaces if required.
_starpu_perfmodel_state
_starpu_task_bundle
_starpu_trs_epoch
drand48_data
starpu_arbiter
starpu_bcsr_interface
BCSR interface for sparse matrices (blocked compressed sparse row representation)
starpu_bitmap
todo
starpu_block_interface
Block interface for 3D dense blocks
starpu_cluster_machine
Deprecated: use starpu_parallel_worker_config instead.
starpu_codelet
The codelet structure describes a kernel that is possibly implemented on various targets. For compatibility, make sure to initialize the whole structure to zero, either with an explicit memset, with the function starpu_codelet_init(), or by letting the compiler do it implicitly, e.g. for objects with static storage duration.
starpu_codelet_pack_arg_data
Structure to be used for starpu_codelet_pack_arg_init() & co, and starpu_codelet_unpack_arg_init() & co. The contents are public; however, users should not access them directly, but only pass the structure as a parameter to the appropriate functions.
starpu_conf
Structure passed to the starpu_init() function to configure StarPU. It has to be initialized with starpu_conf_init(). When the default value is used, StarPU automatically selects the number of processing units and takes the default scheduling policy. Environment variables override the equivalent parameters unless starpu_conf::precedence_over_environment_variables is set.
starpu_coo_interface
COO Matrices
starpu_csr_interface
CSR interface for sparse matrices (compressed sparse row representation)
starpu_data_copy_methods
Define the per-interface methods. If the starpu_data_copy_methods::any_to_any method is provided, it is used by default when no specific method is provided. It can still be useful to provide a more specific method, e.g. when particular CUDA, HIP or OpenCL support is available.
starpu_data_descr
Describe a data handle along with an access mode.
starpu_data_filter
Describe a data partitioning operation, to be given to starpu_data_partition(). See \ref DefiningANewDataFilter for more details.
starpu_data_interface_ops
starpu_disk_ops
Set of functions to manipulate data on disk. See \ref DiskFunctions for more details.
starpu_driver
Structure designating a given driver. See \ref UsingTheDriverAPI for more details.
starpu_driver__bindgen_ty_1
Identifier of the driver.
starpu_fxt_codelet_event
todo
starpu_fxt_mpi_offset
Store information related to clock synchronizations: mainly the offset to apply to each time.
starpu_fxt_options
todo
starpu_matrix_interface
Matrix interface for dense matrices
starpu_multiformat_data_interface_ops
Multiformat operations
starpu_multiformat_interface
todo
starpu_ndim_interface
ndim interface for ndim array
starpu_omp_lock_t
Opaque Simple Lock object for inter-task synchronization operations. See also starpu_omp_init_lock(), starpu_omp_destroy_lock(), starpu_omp_set_lock(), starpu_omp_unset_lock() and starpu_omp_test_lock().
starpu_omp_nest_lock_t
Opaque Nestable Lock object for inter-task synchronization operations. See also starpu_omp_init_nest_lock(), starpu_omp_destroy_nest_lock(), starpu_omp_set_nest_lock(), starpu_omp_unset_nest_lock() and starpu_omp_test_nest_lock().
starpu_omp_parallel_region_attr
Set of attributes used for creating a new parallel region. See also starpu_omp_parallel_region().
starpu_omp_task
Private to StarPU; do not modify.
starpu_omp_task_region_attr
Set of attributes used for creating a new task region. See also starpu_omp_task_region().
starpu_opencl_program
Store the OpenCL programs as compiled for the different OpenCL devices.
starpu_parallel_worker_config
Parallel worker configuration
starpu_perf_counter_listener
starpu_perf_counter_sample
starpu_perf_counter_sample_cl_values
starpu_perf_counter_set
starpu_perfmodel
Contain all information about a performance model. At least the type and symbol fields have to be filled when defining a performance model for a codelet. For compatibility, make sure to initialize the whole structure to zero, either with an explicit memset, or by letting the compiler do it implicitly, e.g. for objects with static storage duration. Fields that are not provided have to be zero.
starpu_perfmodel_arch
todo
starpu_perfmodel_device
todo
starpu_perfmodel_history_entry
todo
starpu_perfmodel_history_list
todo
starpu_perfmodel_history_table
starpu_perfmodel_per_arch
Information about the performance model of a given arch.
starpu_perfmodel_regression_model
todo
starpu_prof_tool_api_info
API info
starpu_prof_tool_info
General information
starpu_profiling_bus_info
todo
starpu_profiling_task_info
Information about the execution of a task. It is accessible from the field starpu_task::profiling_info if profiling was enabled.
starpu_profiling_worker_info
Profiling information associated to a worker. The timing is provided since the previous call to starpu_profiling_worker_get_info().
starpu_pthread_spinlock_t
starpu_sched_ctx_iterator
Structure needed to iterate on the collection
starpu_sched_policy
Contain all the methods that implement a scheduling policy. An application may specify which scheduling strategy to use in the field starpu_conf::sched_policy passed to the function starpu_init().
starpu_task
starpu_task_list
Store a double-chained list of tasks
starpu_tensor_interface
Tensor interface for 4D dense tensors
starpu_transaction
starpu_tree
todo
starpu_variable_interface
Variable interface for a single piece of data (not a vector, a matrix, a list, …)
starpu_vector_interface
todo
starpu_worker_collection
A scheduling context manages a collection of workers that can be stored using different data structures. A generic structure is therefore available to simplify the choice of its type. Only the list data structure is available, but implementations of further data structures (such as a tree) are foreseen.
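
Several of the structures above (starpu_codelet, starpu_perfmodel) are documented as needing to be entirely zero-initialized. From Rust, one way to obtain an all-zero C structure is MaybeUninit::zeroed(). The sketch below uses a hypothetical stand-in type, DemoCodelet, whose fields are invented for illustration; it is not the layout of the real bindgen-generated starpu_codelet, and it only shows the initialization pattern.

```rust
use std::mem::MaybeUninit;
use std::os::raw::{c_char, c_int, c_void};

/// Hypothetical stand-in for a bindgen-generated C struct (NOT the real
/// `starpu_codelet` layout): integers, a nullable function pointer and a
/// nullable string pointer, all of which accept the all-zero bit pattern.
#[repr(C)]
pub struct DemoCodelet {
    pub where_: u32,
    pub cpu_func: Option<unsafe extern "C" fn(*mut *mut c_void, *mut c_void)>,
    pub nbuffers: c_int,
    pub name: *const c_char,
}

/// Return a fully zeroed `DemoCodelet`, the Rust counterpart of
/// `memset(&cl, 0, sizeof(cl))` in C.
pub fn zeroed_codelet() -> DemoCodelet {
    // SAFETY: every field of `DemoCodelet` is valid when all of its bytes are
    // zero (0 for integers, `None` for the function pointer, null for the
    // raw pointer).
    unsafe { MaybeUninit::<DemoCodelet>::zeroed().assume_init() }
}

fn main() {
    let mut cl = zeroed_codelet();
    // Set only the fields that are needed; everything else stays zero.
    cl.nbuffers = 1;
    assert!(cl.cpu_func.is_none());
    assert!(cl.name.is_null());
    println!("where = {}, nbuffers = {}", cl.where_, cl.nbuffers);
}
```

Zeroing is only sound when every field accepts the all-zero bit pattern, which is why the function pointer is wrapped in Option here.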

Constants§

CL_TARGET_OPENCL_VERSION
HAVE_MPI_COMM_F2C
STARPU_ACCESS_MODE_MAX
< The purpose of ::STARPU_ACCESS_MODE_MAX is to be the maximum of this enum.
STARPU_ACQUIRE_NO_NODE
STARPU_ACQUIRE_NO_NODE_LOCK_ALL
STARPU_ANY_WORKER
< any worker, used in the hypervisor
STARPU_BACKTRACE_LENGTH
STARPU_BCSR_GET_OFFSET
STARPU_BCSR_INTERFACE_ID
< Identifier for the BCSR data interface
STARPU_BLOCK_INTERFACE_ID
< Identifier for the block data interface
STARPU_BUBBLE_FUNC
STARPU_BUBBLE_FUNC_ARG
STARPU_BUBBLE_GEN_DAG_FUNC
STARPU_BUBBLE_GEN_DAG_FUNC_ARG
STARPU_BUBBLE_PARENT
STARPU_CACHELINE_SIZE
STARPU_CALLBACK
STARPU_CALLBACK_ARG
STARPU_CALLBACK_ARG_NFREE
STARPU_CALLBACK_WITH_ARG
STARPU_CALLBACK_WITH_ARG_NFREE
STARPU_CLUSTER_AWAKE_WORKERS
STARPU_CLUSTER_CREATE_FUNC
STARPU_CLUSTER_CREATE_FUNC_ARG
STARPU_CLUSTER_INTEL_OPENMP_MKL
< deprecated
STARPU_CLUSTER_KEEP_HOMOGENEOUS
STARPU_CLUSTER_MAX_NB
STARPU_CLUSTER_MIN_NB
STARPU_CLUSTER_NB
STARPU_CLUSTER_NCORES
STARPU_CLUSTER_NEW
STARPU_CLUSTER_OPENMP
< deprecated
STARPU_CLUSTER_PARTITION_ONE
STARPU_CLUSTER_POLICY_NAME
STARPU_CLUSTER_POLICY_STRUCT
STARPU_CLUSTER_PREFERE_MIN
STARPU_CLUSTER_TYPE
STARPU_CL_ARGS
STARPU_CL_ARGS_NFREE
STARPU_CODELET_NOPLANS
STARPU_CODELET_SIMGRID_EXECUTE
STARPU_CODELET_SIMGRID_EXECUTE_AND_INJECT
STARPU_COMMON
< Application-provided common cost model function, with per-arch factor
STARPU_COMMUTE
< ::STARPU_COMMUTE can be passed along with ::STARPU_W or ::STARPU_RW to express that StarPU can let tasks commute, which is useful e.g. when bringing a contribution into some data, which can be done in any order (but still requires sequential consistency against reads or non-commutative writes).
STARPU_COO_GET_OFFSET
STARPU_COO_INTERFACE_ID
< Identifier for the COO data interface
STARPU_CPU_RAM
< CPU core
STARPU_CPU_WORKER
< CPU core
STARPU_CSR_GET_OFFSET
STARPU_CSR_INTERFACE_ID
< Identifier for the CSR data interface
STARPU_CUDA_ASYNC
STARPU_CUDA_RAM
< NVIDIA CUDA device
STARPU_CUDA_WORKER
< NVIDIA CUDA device
STARPU_DATA_ARRAY
STARPU_DATA_MODE_ARRAY
STARPU_DEFAULT_PRIO
STARPU_DISK_RAM
< Disk memory
STARPU_DISK_SIZE_MIN
STARPU_EPILOGUE_CALLBACK
STARPU_EPILOGUE_CALLBACK_ARG
STARPU_EXECUTE_ON_DATA
STARPU_EXECUTE_ON_NODE
STARPU_EXECUTE_ON_WORKER
STARPU_EXECUTE_WHERE
STARPU_FETCH
A task really needs it now!
STARPU_FLOPS
STARPU_FORKJOIN
< for a parallel task whose threads are started by the codelet function, which has to use starpu_combined_worker_get_size() to determine how many threads should be started.
STARPU_FXT_MAX_FILES
STARPU_HANDLES_SEQUENTIAL_CONSISTENCY
STARPU_HAVE_ATOMIC_COMPARE_EXCHANGE_N
STARPU_HAVE_ATOMIC_COMPARE_EXCHANGE_N_8
STARPU_HAVE_ATOMIC_EXCHANGE_N
STARPU_HAVE_ATOMIC_EXCHANGE_N_8
STARPU_HAVE_ATOMIC_FETCH_ADD
STARPU_HAVE_ATOMIC_FETCH_ADD_8
STARPU_HAVE_ATOMIC_FETCH_OR
STARPU_HAVE_ATOMIC_FETCH_OR_8
STARPU_HAVE_ATOMIC_TEST_AND_SET
STARPU_HAVE_BLAS
STARPU_HAVE_CBLAS_H
STARPU_HAVE_CXX11
STARPU_HAVE_FFTW
STARPU_HAVE_FFTWF
STARPU_HAVE_FFTWL
STARPU_HAVE_GLPK_H
STARPU_HAVE_HELGRIND_H
STARPU_HAVE_HWLOC
STARPU_HAVE_LIBNUMA
STARPU_HAVE_MALLOC_H
STARPU_HAVE_MEMALIGN
STARPU_HAVE_MEMCHECK_H
STARPU_HAVE_MPI_COMM_CREATE_GROUP
STARPU_HAVE_NEARBYINTF
STARPU_HAVE_POSIX_MEMALIGN
STARPU_HAVE_PTHREAD_BARRIER
STARPU_HAVE_PTHREAD_SETNAME_NP
STARPU_HAVE_PTHREAD_SPIN_LOCK
STARPU_HAVE_RINTF
STARPU_HAVE_SETENV
STARPU_HAVE_STATEMENT_EXPRESSIONS
STARPU_HAVE_STRERROR_R
STARPU_HAVE_STRUCT_TIMESPEC
STARPU_HAVE_SYNC_BOOL_COMPARE_AND_SWAP
STARPU_HAVE_SYNC_BOOL_COMPARE_AND_SWAP_8
STARPU_HAVE_SYNC_FETCH_AND_ADD
STARPU_HAVE_SYNC_FETCH_AND_ADD_8
STARPU_HAVE_SYNC_FETCH_AND_OR
STARPU_HAVE_SYNC_FETCH_AND_OR_8
STARPU_HAVE_SYNC_LOCK_TEST_AND_SET
STARPU_HAVE_SYNC_SYNCHRONIZE
STARPU_HAVE_SYNC_VAL_COMPARE_AND_SWAP
STARPU_HAVE_SYNC_VAL_COMPARE_AND_SWAP_8
STARPU_HAVE_UNISTD_H
STARPU_HAVE_UNSETENV
STARPU_HAVE_VALGRIND_H
STARPU_HAVE_X11
STARPU_HIP_ASYNC
STARPU_HIP_RAM
< NVIDIA/AMD HIP device
STARPU_HIP_WORKER
< NVIDIA/AMD HIP device
STARPU_HISTORY_BASED
< Automatic history-based cost model
STARPU_HYPERVISOR_TAG
STARPU_IDLEFETCH
Get this here when you have time to
STARPU_LINUX_SYS
STARPU_LOCALITY
< used to tell the scheduler which data is the most important for the task, and should thus be used to try to group tasks on the same core or cache, etc. For now only the ws and lws schedulers take this flag into account, and only when rebuilt with the \c USE_LOCALITY flag defined in the src/sched_policies/work_stealing_policy.c source code.
STARPU_MAIN_RAM
STARPU_MAJOR_VERSION
STARPU_MALLOC_COUNT
STARPU_MALLOC_NORECLAIM
STARPU_MALLOC_PINNED
STARPU_MALLOC_SIMULATION_FOLDED
STARPU_MALLOC_SIMULATION_UNIQUE
STARPU_MATRIX_INTERFACE_ID
< Identifier for the matrix data interface
STARPU_MAXCPUS
STARPU_MAXCUDADEVS
STARPU_MAXHIPDEVS
STARPU_MAXIMPLEMENTATIONS
STARPU_MAXMAXFPGADEVS
STARPU_MAXNODES
STARPU_MAXNUMANODES
STARPU_MAXOPENCLDEVS
STARPU_MAX_FPGA_RAM
< Maxeler FPGA device
STARPU_MAX_FPGA_WORKER
< Maxeler FPGA device
STARPU_MAX_INTERFACE_ID
< Maximum number of data interfaces
STARPU_MAX_RAM
< Maximum value of memory types
STARPU_MEMORY_OVERFLOW
STARPU_MEMORY_WAIT
STARPU_MINOR_VERSION
STARPU_MODE_SHIFT
STARPU_MPI_MS_RAM
< MPI Slave device
STARPU_MPI_MS_WORKER
< MPI Slave device
STARPU_MPI_REDUX
< Inter-node reduction only. This is similar to ::STARPU_REDUX, except that StarPU will allocate a per-node buffer only, i.e. parallelism will be achieved between nodes, but not within each node. This is useful when the per-worker buffers allocated with ::STARPU_REDUX consume too much memory.
STARPU_MULTIFORMAT_INTERFACE_ID
< Identifier for the multiformat data interface
STARPU_MULTIPLE_REGRESSION_BASED
< Automatic multiple linear regression-based cost model. Application provides parameters, their combinations and exponents.
STARPU_NAME
STARPU_NARCH
< Number of arch types
STARPU_NDIM_INTERFACE_ID
< Identifier for the ndim array data interface
STARPU_NFETCH
Get this here when you have time to
STARPU_NL_REGRESSION_BASED
< Automatic non-linear regression-based cost model (a * size ^ b + c)
STARPU_NMAXBUFS
STARPU_NMAXWORKERS
STARPU_NMAX_SCHED_CTXS
STARPU_NODE_SELECTION_POLICY
STARPU_NOFOOTPRINT
< Ignore this data for the footprint computation. See \ref ScratchData
STARPU_NONE
< todo
STARPU_NON_BLOCKING_DRIVERS
STARPU_NOPLAN
< Disable automatic submission of asynchronous partitioning/unpartitioning, only used internally by StarPU
STARPU_NOWHERE
STARPU_NRAM
< Number of memory types
STARPU_NS_PER_S
STARPU_OPENCL_ASYNC
STARPU_OPENCL_RAM
< OpenCL device
STARPU_OPENCL_WORKER
< OpenCL device
STARPU_OPENGL_RENDER
STARPU_OPENMP
STARPU_PAPI
STARPU_PARALLEL_WORKER
STARPU_PARALLEL_WORKER_AWAKE_WORKERS
STARPU_PARALLEL_WORKER_CREATE_FUNC
STARPU_PARALLEL_WORKER_CREATE_FUNC_ARG
STARPU_PARALLEL_WORKER_GNU_OPENMP_MKL
< todo
STARPU_PARALLEL_WORKER_INTEL_OPENMP_MKL
< todo
STARPU_PARALLEL_WORKER_KEEP_HOMOGENEOUS
STARPU_PARALLEL_WORKER_MAX_NB
STARPU_PARALLEL_WORKER_MIN_NB
STARPU_PARALLEL_WORKER_NB
STARPU_PARALLEL_WORKER_NCORES
STARPU_PARALLEL_WORKER_NEW
STARPU_PARALLEL_WORKER_OPENMP
< todo
STARPU_PARALLEL_WORKER_PARTITION_ONE
STARPU_PARALLEL_WORKER_POLICY_NAME
STARPU_PARALLEL_WORKER_POLICY_STRUCT
STARPU_PARALLEL_WORKER_PREFERE_MIN
STARPU_PARALLEL_WORKER_TYPE
STARPU_PERFMODEL_INVALID
STARPU_PER_ARCH
< Application-provided per-arch cost model function
STARPU_PER_WORKER
< Application-provided per-worker cost model function
STARPU_POSSIBLY_PARALLEL
STARPU_PREFETCH
It is a good idea to have it asap
STARPU_PRIORITY
STARPU_PROFILING_DISABLE
STARPU_PROFILING_ENABLE
STARPU_PROF_TOOL
STARPU_PROLOGUE_CALLBACK
STARPU_PROLOGUE_CALLBACK_ARG
STARPU_PROLOGUE_CALLBACK_ARG_NFREE
STARPU_PROLOGUE_CALLBACK_POP
STARPU_PROLOGUE_CALLBACK_POP_ARG
STARPU_PROLOGUE_CALLBACK_POP_ARG_NFREE
STARPU_PTHREAD_BARRIER_SERIAL_THREAD
STARPU_PTHREAD_COND_INITIALIZER_ZERO
STARPU_PTHREAD_MUTEX_INITIALIZER_ZERO
STARPU_PTHREAD_RWLOCK_INITIALIZER_ZERO
STARPU_PYTHON_HAVE_NUMPY
STARPU_QUICK_CHECK
STARPU_R
< read-only mode
STARPU_REDUX
< Reduction mode. StarPU will allocate on the fly a per-worker buffer, so that various tasks that access the same data in ::STARPU_REDUX mode can execute in parallel. When a task accesses the data without ::STARPU_REDUX, StarPU will automatically reduce the different contributions.
STARPU_REGRESSION_BASED
< Automatic linear regression-based cost model (alpha * size ^ beta)
STARPU_RELEASE_VERSION
STARPU_RW
< read-write mode. Equivalent to ::STARPU_R|::STARPU_W
STARPU_SCHED_CTX
STARPU_SCHED_CTX_AWAKE_WORKERS
STARPU_SCHED_CTX_CUDA_NSMS
STARPU_SCHED_CTX_HIERARCHY_LEVEL
STARPU_SCHED_CTX_NESTED
STARPU_SCHED_CTX_POLICY_INIT
STARPU_SCHED_CTX_POLICY_MAX_PRIO
STARPU_SCHED_CTX_POLICY_MIN_PRIO
STARPU_SCHED_CTX_POLICY_NAME
STARPU_SCHED_CTX_POLICY_STRUCT
STARPU_SCHED_CTX_SUB_CTXS
STARPU_SCHED_CTX_USER_DATA
STARPU_SCRATCH
< A temporary buffer is allocated for the task, but StarPU does not enforce data consistency: each device has its own buffer, independent of the others (even for CPUs), and no data transfer is ever performed. This is useful for temporary variables, to avoid allocating/freeing buffers inside each task. Currently, no behavior is defined concerning the relation with the ::STARPU_R and ::STARPU_W modes and the value provided at registration, i.e. the value of the scratch buffer is undefined at entry of the codelet function. Defining at least the initial value is being considered for future extensions. For now, data to be used in ::STARPU_SCRATCH mode should be registered with node -1 and a NULL pointer, since the value of the provided buffer is simply ignored.
STARPU_SEQ
< (default) for classical sequential tasks.
STARPU_SEQUENTIAL_CONSISTENCY
STARPU_SHIFTED_MODE_MAX
STARPU_SPECIFIC_NODE_CPU
STARPU_SPECIFIC_NODE_FAST
STARPU_SPECIFIC_NODE_LOCAL
STARPU_SPECIFIC_NODE_LOCAL_OR_CPU
STARPU_SPECIFIC_NODE_NONE
STARPU_SPECIFIC_NODE_SLOW
STARPU_SPMD
< for a parallel task whose threads are handled by StarPU; the code has to use starpu_combined_worker_get_size() and starpu_combined_worker_get_rank() to distribute the work.
STARPU_SSEND
< used in starpu_mpi_task_insert() to specify the data has to be sent using a synchronous and non-blocking mode (see starpu_mpi_issend())
STARPU_SYSTEM_BLAS
STARPU_TAG
STARPU_TAG_ONLY
STARPU_TASK_BLOCKED
< The task has just been submitted, and its dependencies have not been checked yet.
STARPU_TASK_BLOCKED_ON_DATA
< The task is waiting for some data.
STARPU_TASK_BLOCKED_ON_TAG
< The task is waiting for a tag.
STARPU_TASK_BLOCKED_ON_TASK
< The task is waiting for a task.
STARPU_TASK_COLOR
STARPU_TASK_DEPS_ARRAY
STARPU_TASK_END_DEP
STARPU_TASK_END_DEPS_ARRAY
STARPU_TASK_FILE
STARPU_TASK_FINISHED
< The task is finished executing.
STARPU_TASK_INIT
< The task has just been initialized.
STARPU_TASK_INVALID
STARPU_TASK_LINE
STARPU_TASK_NO_SUBMITORDER
STARPU_TASK_PREFETCH
A task will need it soon
STARPU_TASK_PROFILING_INFO
STARPU_TASK_READY
< The task is ready for execution.
STARPU_TASK_RUNNING
< The task is running on some worker.
STARPU_TASK_SCHED_DATA
STARPU_TASK_STOPPED
< The task is stopped.
STARPU_TASK_SYNCHRONOUS
STARPU_TASK_TYPE_DATA_ACQUIRE
STARPU_TASK_TYPE_INTERNAL
STARPU_TASK_TYPE_NORMAL
STARPU_TASK_WORKERIDS
STARPU_TCPIP_MS_RAM
< TCPIP Slave device
STARPU_TCPIP_MS_WORKER
< TCPIP Slave device
STARPU_TENSOR_INTERFACE_ID
< Identifier for the tensor data interface
STARPU_THREAD_ACTIVE
STARPU_TRANSACTION
STARPU_UNKNOWN_INTERFACE_ID
< Unknown interface
STARPU_UNMAP
< Request unmapping the destination replicate, only used internally by StarPU
STARPU_UNUSED
STARPU_USE_CPU
STARPU_USE_DRAND48
STARPU_USE_ERAND48_R
STARPU_USE_FXT
STARPU_USE_MPI
STARPU_USE_MPI_MPI
STARPU_USE_OPENCL
STARPU_USE_TCPIP_MASTER_SLAVE
STARPU_VALUE
STARPU_VARIABLE_INTERFACE_ID
< Identifier for the variable data interface
STARPU_VARIABLE_NBUFFERS
STARPU_VECTOR_INTERFACE_ID
< Identifier for the vector data interface
STARPU_VOID_INTERFACE_ID
< Identifier for the void data interface
STARPU_W
< write-only mode
STARPU_WORKER_LIST
< The collection is an array
STARPU_WORKER_ORDER
STARPU_WORKER_TREE
< The collection is a tree
starpu_omp_proc_bind_close
< Assign every thread in the team to a place close to the parent thread.
starpu_omp_proc_bind_false
< Team threads may be moved between places at any time.
starpu_omp_proc_bind_master
< Assign every thread in the team to the same place as the master thread.
starpu_omp_proc_bind_spread
< Assign team threads as a sparse distribution over the selected places.
starpu_omp_proc_bind_true
< Team threads may not be moved between places.
starpu_omp_proc_bind_undefined
< Undefined processor binding method.
starpu_omp_sched_auto
< Automatically chosen iteration scheduling algorithm.
starpu_omp_sched_dynamic
< Dynamic iteration scheduling algorithm.
starpu_omp_sched_guided
< Guided iteration scheduling algorithm.
starpu_omp_sched_runtime
< Choice of iteration scheduling algorithm deferred to runtime.
starpu_omp_sched_static
< Static iteration scheduling algorithm.
starpu_omp_sched_undefined
< Undefined iteration scheduling algorithm.
starpu_perf_counter_scope_global
< global scope
starpu_perf_counter_scope_per_codelet
< per-codelet scope
starpu_perf_counter_scope_per_worker
< per-worker scope
starpu_perf_counter_scope_undefined
< undefined scope
starpu_perf_counter_type_double
< 64-bit double precision floating-point value
starpu_perf_counter_type_float
< 32-bit single precision floating-point value
starpu_perf_counter_type_int32
< signed 32-bit integer value
starpu_perf_counter_type_int64
< signed 64-bit integer value
starpu_perf_counter_type_undefined
< undefined value type
starpu_perf_knob_scope_global
< global scope
starpu_perf_knob_scope_per_scheduler
< per-scheduler scope
starpu_perf_knob_scope_per_worker
< per-worker scope
starpu_perf_knob_scope_undefined
< undefined scope
starpu_perf_knob_type_double
< 64-bit double precision floating-point value
starpu_perf_knob_type_float
< 32-bit single precision floating-point value
starpu_perf_knob_type_int32
< signed 32-bit integer value
starpu_perf_knob_type_int64
< signed 64-bit integer value
starpu_perf_knob_type_undefined
< undefined value type
starpu_prof_tool_command_reg
starpu_prof_tool_command_toggle
starpu_prof_tool_command_toggle_per_thread
starpu_prof_tool_driver_cpu
starpu_prof_tool_driver_gpu
starpu_prof_tool_driver_hip
starpu_prof_tool_driver_ocl
starpu_prof_tool_event_driver_deinit
starpu_prof_tool_event_driver_init
starpu_prof_tool_event_driver_init_end
starpu_prof_tool_event_driver_init_start
starpu_prof_tool_event_end_cpu_exec
starpu_prof_tool_event_end_gpu_exec
starpu_prof_tool_event_end_transfer
starpu_prof_tool_event_init
starpu_prof_tool_event_init_begin
starpu_prof_tool_event_init_end
starpu_prof_tool_event_none
starpu_prof_tool_event_start_cpu_exec
starpu_prof_tool_event_start_gpu_exec
starpu_prof_tool_event_start_transfer
starpu_prof_tool_event_terminate
starpu_prof_tool_event_user_end
starpu_prof_tool_event_user_start
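
The access-mode constants listed above behave as bit flags: STARPU_RW is documented as equivalent to STARPU_R|STARPU_W, and STARPU_COMMUTE can be passed along with STARPU_W or STARPU_RW. A minimal sketch of that composition follows. The DEMO_* constants are local stand-ins whose numeric values are assumptions made for illustration; real code should use the constants exported by the bindings.

```rust
// Stand-in flag values, assumed for illustration only.
const DEMO_R: u32 = 1 << 0; // read-only mode
const DEMO_W: u32 = 1 << 1; // write-only mode
const DEMO_RW: u32 = DEMO_R | DEMO_W; // read-write mode
const DEMO_COMMUTE: u32 = 1 << 5; // modifier combined with W or RW

/// True if `mode` includes read access.
fn reads(mode: u32) -> bool {
    (mode & DEMO_R) != 0
}

/// True if `mode` includes write access.
fn writes(mode: u32) -> bool {
    (mode & DEMO_W) != 0
}

/// True if `mode` carries the commute modifier.
fn commutes(mode: u32) -> bool {
    (mode & DEMO_COMMUTE) != 0
}

fn main() {
    // A read-write access whose contributions may be applied in any order.
    let mode = DEMO_RW | DEMO_COMMUTE;
    assert!(reads(mode) && writes(mode) && commutes(mode));
    // Write-only access does not imply read access.
    assert!(!reads(DEMO_W));
    println!("mode bits: {:#b}", mode);
}
```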

Statics§

_starpu_silent
starpu_codelet_nop
Codelet with empty function defined for all drivers
starpu_disk_hdf5_ops
Use the HDF5 library.
starpu_disk_leveldb_ops
Use the LevelDB library created by Google. More information at https://code.google.com/p/leveldb/. Does not support asynchronous transfers.
starpu_disk_stdio_ops
Use the stdio library (fwrite, fread…) to read/write on disk.
starpu_disk_swap_node
Contain the node number of the disk swap, if set up through the \ref STARPU_DISK_SWAP variable.
starpu_disk_unistd_o_direct_ops
Use the unistd library (write, read…) to read/write on disk with the O_DIRECT flag.
starpu_disk_unistd_ops
Use the unistd library (write, read…) to read/write on disk.
starpu_interface_bcsr_ops
BCSR Data Interface
starpu_interface_block_ops
Block Data Interface
starpu_interface_coo_ops
Accessing COO Data Interfaces
starpu_interface_csr_ops
CSR Data Interface
starpu_interface_matrix_ops
Accessing Matrix Data Interfaces
starpu_interface_ndim_ops
Ndim Array Data Interface
starpu_interface_tensor_ops
Tensor Data Interface
starpu_interface_variable_ops
Variable Data Interface
starpu_interface_vector_ops
Vector Data Interface
starpu_interface_void_ops
Void Data Interface
starpu_perfmodel_nop
Performance model which always returns 1µs.
starpu_worker_list
starpu_worker_tree

Functions§

_starpu_worker_get_id_check
starpu_arbiter_create
Create a data access arbiter, see \ref ConcurrentDataAccess for the details
starpu_arbiter_destroy
Destroy the \p arbiter. This must only be called after all data assigned to it have been unregistered. See \ref ConcurrentDataAccess for the details.
starpu_arch_mask_to_worker_archtype
Convert a mask of architectures to a worker archtype. See \ref TopologyWorkers for more details.
starpu_asynchronous_copy_disabled
Return 1 if asynchronous data transfers between CPU and accelerators are disabled. See \ref Basic for more details.
starpu_asynchronous_copy_disabled_for
Return 1 if asynchronous data transfers with a given kind of memory are disabled.
starpu_asynchronous_cuda_copy_disabled
Return 1 if asynchronous data transfers between CPU and CUDA accelerators are disabled. See \ref cudaWorkers for more details.
starpu_asynchronous_hip_copy_disabled
Return 1 if asynchronous data transfers between CPU and HIP accelerators are disabled. See \ref hipWorkers for more details.
starpu_asynchronous_max_fpga_copy_disabled
Return 1 if asynchronous data transfers between CPU and Maxeler FPGA devices are disabled. See \ref maxfpgaWorkers for more details.
starpu_asynchronous_mpi_ms_copy_disabled
Return 1 if asynchronous data transfers between CPU and MPI Slave devices are disabled. See \ref mpimsWorkers for more details.
starpu_asynchronous_opencl_copy_disabled
Return 1 if asynchronous data transfers between CPU and OpenCL accelerators are disabled. See \ref openclWorkers for more details.
starpu_asynchronous_tcpip_ms_copy_disabled
Return 1 if asynchronous data transfers between CPU and TCP/IP Slave devices are disabled. See \ref tcpipmsWorkers for more details.
starpu_bcsr_data_register
This variant of starpu_data_register() uses the BCSR (Blocked Compressed Sparse Row Representation) sparse matrix interface. Register the sparse matrix made of \p nnz non-zero blocks of elements of size \p elemsize stored in \p nzval and initializes \p handle to represent it. Blocks have size \p r * \p c. \p nrow is the number of rows (in terms of blocks), \p colind is an array of nnz elements, colind[i] is the block-column index for block i in \p nzval, \p rowptr is an array of nrow+1 elements, rowptr[i] is the block-index (in \p nzval) of the first block of row i. By convention, rowptr[nrow] is the number of blocks, this allows an easier access of the matrix’s elements for the kernels. \p firstentry is the index of the first entry of the given arrays (usually 0 or 1).
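
The relationship between rowptr, colind and nzval described above can be sketched in plain Rust. BcsrView is a hypothetical helper type, not part of the bindings. It assumes firstentry is 0 and that the r * c values of each block are stored contiguously in nzval, one block after another.

```rust
/// Hypothetical read-only view over the BCSR arrays described above.
/// Assumes `firstentry == 0` and contiguous storage of each block in `nzval`.
struct BcsrView<'a> {
    nrow: usize,        // number of block rows
    r: usize,           // rows per block
    c: usize,           // columns per block
    rowptr: &'a [u32],  // nrow + 1 entries
    colind: &'a [u32],  // one entry per stored block
    nzval: &'a [f32],   // r * c values per stored block
}

impl<'a> BcsrView<'a> {
    /// Total number of stored blocks: by convention `rowptr[nrow]`.
    fn nblocks(&self) -> usize {
        self.rowptr[self.nrow] as usize
    }

    /// Blocks of block-row `i`, as (block-column index, block values) pairs.
    fn blocks_in_row(&self, i: usize) -> Vec<(usize, &'a [f32])> {
        let (colind, nzval) = (self.colind, self.nzval);
        let block_len = self.r * self.c;
        let first = self.rowptr[i] as usize;
        let end = self.rowptr[i + 1] as usize;
        (first..end)
            .map(move |b| (colind[b] as usize, &nzval[b * block_len..(b + 1) * block_len]))
            .collect()
    }
}

fn main() {
    // Two block rows of 2x2 blocks: row 0 holds blocks in columns 0 and 1,
    // row 1 holds a single block in column 1.
    let rowptr = [0u32, 2, 3];
    let colind = [0u32, 1, 1];
    let nzval: Vec<f32> = (1..=12).map(|v| v as f32).collect();
    let m = BcsrView { nrow: 2, r: 2, c: 2, rowptr: &rowptr, colind: &colind, nzval: &nzval };

    assert_eq!(m.nblocks(), 3);
    for (col, values) in m.blocks_in_row(1) {
        println!("block row 1, block column {}: {:?}", col, values);
    }
}
```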
starpu_bcsr_filter_canonical_block
Partition a block-sparse matrix into dense matrices. starpu_data_filter::get_child_ops needs to be set to starpu_bcsr_filter_canonical_block_child_ops() and starpu_data_filter::get_nchildren set to starpu_bcsr_filter_canonical_block_get_nchildren().
starpu_bcsr_filter_canonical_block_child_ops
Return the child_ops of the partition obtained with starpu_bcsr_filter_canonical_block(). See \ref BCSRDataInterface for more details.
starpu_bcsr_filter_canonical_block_get_nchildren
Return the number of children obtained with starpu_bcsr_filter_canonical_block(). See \ref BCSRDataInterface for more details.
starpu_bcsr_filter_vertical_block
Partition a block-sparse matrix into block-sparse matrices.
starpu_bcsr_get_c
Return the number of columns in a block.
starpu_bcsr_get_elemsize
Return the size of the elements in the matrix designated by \p handle.
starpu_bcsr_get_firstentry
Return the index at which all arrays (the column indexes, the row pointers…) of the matrix designated by \p handle start.
starpu_bcsr_get_local_colind
Return a pointer to the column index, which holds the positions of the non-zero entries in the matrix designated by \p handle.
starpu_bcsr_get_local_nzval
Return a pointer to the non-zero values of the matrix designated by \p handle.
starpu_bcsr_get_local_rowptr
Return the row pointer array of the matrix designated by \p handle.
starpu_bcsr_get_nnz
Return the number of non-zero elements in the matrix designated by \p handle.
starpu_bcsr_get_nrow
Return the number of rows (in terms of blocks of size r*c) in the matrix designated by \p handle.
starpu_bcsr_get_r
Return the number of rows in a block.
starpu_bind_thread_on
Bind the calling thread on the given \p cpuid (which should have been obtained with starpu_get_next_bindid()).
starpu_bind_thread_on_cpu
Bind the calling thread on the given \p cpuid
starpu_bind_thread_on_main
Bind the calling thread back to the core reserved for the main thread.
starpu_bind_thread_on_worker
Bind the calling thread on the cores corresponding to the \p workerid .
starpu_bindid_get_workerids
See \ref TopologyWorkers for more details.
starpu_block_data_register
Register the \p nx x \p ny x \p nz 3D matrix of \p elemsize byte elements pointed to by \p ptr and initialize \p handle to represent it. \p ldy and \p ldz specify the number of elements between rows and between z planes.
starpu_block_filter_block
Partition a block along the X dimension, thus getting (x/\p nparts ,y,z) 3D matrices. If \p nparts does not divide x, the last submatrix contains the remainder.
starpu_block_filter_block_shadow
Partition a block along the X dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting ((x-2shadow)/\p nparts +2shadow,y,z) blocks. If \p nparts does not divide x, the last submatrix contains the remainder.
starpu_block_filter_depth_block
Partition a block along the Z dimension, thus getting (x,y,z/\p nparts) blocks. If \p nparts does not divide z, the last submatrix contains the remainder.
starpu_block_filter_depth_block_shadow
Partition a block along the Z dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting (x,y,(z-2shadow)/\p nparts +2shadow) blocks. If \p nparts does not divide z, the last submatrix contains the remainder.
starpu_block_filter_pick_matrix_child_ops
Return the child_ops of the partition obtained with starpu_block_filter_pick_matrix_z() and starpu_block_filter_pick_matrix_y(). See \ref BlockDataInterface for more details.
starpu_block_filter_pick_matrix_y
Pick \p nparts contiguous matrices from a block along the Y dimension. The starting position on Y-axis is set in starpu_data_filter::filter_arg_ptr.
starpu_block_filter_pick_matrix_z
Pick \p nparts contiguous matrices from a block along the Z dimension. The starting position on Z-axis is set in starpu_data_filter::filter_arg_ptr.
starpu_block_filter_pick_variable
Pick \p nparts contiguous variables from a block. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_block_filter_pick_variable_child_ops
Return the child_ops of the partition obtained with starpu_block_filter_pick_variable(). See \ref BlockDataInterface for more details.
starpu_block_filter_vertical_block
Partition a block along the Y dimension, thus getting (x,y/\p nparts ,z) blocks. If \p nparts does not divide y, the last submatrix contains the remainder.
starpu_block_filter_vertical_block_shadow
Partition a block along the Y dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting (x,(y-2shadow)/\p nparts +2shadow,z) 3D matrices. If \p nparts does not divide y, the last submatrix contains the remainder.
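
The child sizes quoted in the block filter descriptions above can be worked through with a short sketch. It follows the documented wording literally (each child gets n / nparts elements along the partitioned axis and the last child also receives the remainder) and is an illustration under that assumption, not the library's own computation. Both function names are hypothetical.

```rust
/// Length along the partitioned axis of child `id` when an axis of `n`
/// elements is split into `nparts` children: every child gets `n / nparts`
/// elements and the last one also receives the remainder.
fn child_len(n: usize, nparts: usize, id: usize) -> usize {
    assert!(nparts > 0 && id < nparts);
    let base = n / nparts;
    if id == nparts - 1 {
        base + n % nparts
    } else {
        base
    }
}

/// Same, for the `_shadow` variants: the axis of `n` elements includes a
/// border of `shadow` elements on each side, and every child carries its own
/// two borders, giving (n - 2*shadow) / nparts + 2*shadow.
fn child_len_shadow(n: usize, nparts: usize, id: usize, shadow: usize) -> usize {
    assert!(n >= 2 * shadow);
    child_len(n - 2 * shadow, nparts, id) + 2 * shadow
}

fn main() {
    // 10 elements in 3 parts: the last child absorbs the remainder.
    let lens: Vec<usize> = (0..3).map(|id| child_len(10, 3, id)).collect();
    assert_eq!(lens, vec![3, 3, 4]);

    // 12 elements with a shadow of 1 in 2 parts: (12 - 2) / 2 + 2 = 7.
    assert_eq!(child_len_shadow(12, 2, 0, 1), 7);
    println!("child lengths: {:?}", lens);
}
```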
starpu_block_get_elemsize
Return the size of the elements of the block designated by \p handle.
starpu_block_get_local_ldy
Return the number of elements between each row of the block designated by \p handle, in the format of the current memory node.
starpu_block_get_local_ldz
Return the number of elements between each z plane of the block designated by \p handle, in the format of the current memory node.
starpu_block_get_local_ptr
Return the local pointer associated with \p handle.
starpu_block_get_nx
Return the number of elements on the x-axis of the block designated by \p handle.
starpu_block_get_ny
Return the number of elements on the y-axis of the block designated by \p handle.
starpu_block_get_nz
Return the number of elements on the z-axis of the block designated by \p handle.
starpu_block_ptr_register
Register into \p handle that, to store data on node \p node, it should use the buffer located at \p ptr, or the device handle \p dev_handle and offset \p offset (for OpenCL, notably), with \p ldy elements between rows and \p ldz elements between z planes.
starpu_bound_compute
Get the theoretical upper bound (in ms). This needs glpk support to have been detected by the configure script. It returns 0 if some performance models are not calibrated. \p integer selects between integer solving (which takes a long time but is correct) and relaxed solving (which provides an approximate solution).
starpu_bound_print
Emit on \p output the statistics of actual execution vs theoretical upper bound. \p integer selects between integer solving (which takes a long time but is correct) and relaxed solving (which provides an approximate solution).
starpu_bound_print_dot
Emit the DAG that was recorded on \p output.
starpu_bound_print_lp
Emit the Linear Programming system on \p output for the recorded tasks, in the lp format.
starpu_bound_print_mps
Emit the Linear Programming system on \p output for the recorded tasks, in the mps format.
starpu_bound_start
Start recording tasks (resets stats). \p deps tells whether dependencies should be recorded too (this is quite expensive).
starpu_bound_stop
Stop recording tasks.
starpu_bus_get_count
Return the number of buses in the machine. See \ref HardwareTopology for more details.
starpu_bus_get_direct
See \ref HardwareTopology for more details.
starpu_bus_get_dst
Return the destination point of bus \p busid. See \ref HardwareTopology for more details.
starpu_bus_get_id
Return the identifier of the bus between \p src and \p dst. See \ref HardwareTopology for more details.
starpu_bus_get_ngpus
See \ref HardwareTopology for more details.
starpu_bus_get_profiling_info
See _starpu_profiling_bus_helper_display_summary in src/profiling/profiling_helpers.c for a usage example. Note that calling starpu_bus_get_profiling_info() resets the counters to zero. See \ref FeedBackFigures for more details.
starpu_bus_get_src
Return the source point of bus \p busid. See \ref HardwareTopology for more details.
starpu_bus_print_affinity
Print the affinity devices on \p f.
starpu_bus_print_bandwidth
Print a matrix of bus bandwidths on \p f.
starpu_bus_print_filenames
Print on \p f the name of the files containing the matrix of bus bandwidths, the affinity devices and the latency.
starpu_bus_set_direct
See \ref HardwareTopology for more details.
starpu_bus_set_ngpus
See \ref HardwareTopology for more details.
starpu_cluster_machine
@deprecated Use starpu_parallel_worker_init()
starpu_cluster_print
@deprecated Use starpu_parallel_worker_print()
starpu_codelet_display_stats
Output on \c stderr some statistics on the codelet \p cl. See \ref Per-codeletFeedback for more details.
starpu_codelet_dup_arg
Unpack the next argument of unknown size from \p state into \p ptr with a copy. \p ptr is allocated before the value of the argument is copied into it. The size of the argument is returned in \p size. \p state has to be initialized before with starpu_codelet_unpack_arg_init(). See \ref InsertTaskUtility for more details.
starpu_codelet_init
Initialize \p cl with default values. Codelets should preferably be initialized statically as shown in \ref DefiningACodelet. However, such an initialisation is not always possible, e.g. when using C++. See \ref DefiningACodelet for more details.
starpu_codelet_pack_arg
Pack one argument into struct starpu_codelet_pack_arg \p state. That structure has to be initialized before with starpu_codelet_pack_arg_init(), and after all starpu_codelet_pack_arg() calls have been performed, starpu_codelet_pack_arg_fini() has to be used to get the \p cl_arg and \p cl_arg_size to be put in the task. See \ref InsertTaskUtility for more details.
starpu_codelet_pack_arg_fini
Finish packing data, after calling starpu_codelet_pack_arg_init() once and starpu_codelet_pack_arg() several times. See \ref InsertTaskUtility for more details.
starpu_codelet_pack_arg_init
Initialize struct starpu_codelet_pack_arg before calling starpu_codelet_pack_arg() and starpu_codelet_pack_arg_fini(). This will simply initialize the content of the structure. See \ref InsertTaskUtility for more details.
starpu_codelet_pack_args
Pack arguments of type ::STARPU_VALUE into a buffer which can be given to a codelet and later unpacked with the function starpu_codelet_unpack_args().
starpu_codelet_pick_arg
Unpack the next argument of unknown size from \p state into \p ptr. \p ptr will be a pointer to the memory of the argument. The size of the argument is returned in \p size. \p state has to be initialized before with starpu_codelet_unpack_arg_init(). See \ref InsertTaskUtility for more details.
starpu_codelet_unpack_arg
Unpack the next argument of size \p size from \p state into \p ptr with a copy. \p state has to be initialized before with starpu_codelet_unpack_arg_init(). See \ref InsertTaskUtility for more details.
starpu_codelet_unpack_arg_fini
Finish unpacking data, after calling starpu_codelet_unpack_arg_init() once and starpu_codelet_unpack_arg() or starpu_codelet_dup_arg() or starpu_codelet_pick_arg() several times. See \ref InsertTaskUtility for more details.
starpu_codelet_unpack_arg_init
Initialize \p state with \p cl_arg and \p cl_arg_size. This has to be called before calling starpu_codelet_unpack_arg(). See \ref InsertTaskUtility for more details.
starpu_codelet_unpack_args
Retrieve the arguments of type ::STARPU_VALUE associated to a task automatically created using the function starpu_task_insert(). If any parameter’s value is 0, unpacking will stop there and ignore the remaining parameters. See \ref InsertTaskUtility for more details.
starpu_codelet_unpack_args_and_copyleft
Similar to starpu_codelet_unpack_args(), but if any parameter is 0, copy the part of \p cl_arg that has not been read in \p buffer which can then be used in a later call to one of the unpack functions. See \ref InsertTaskUtility for more details.
starpu_codelet_unpack_discard_arg
Call this function during unpacking to skip saving the argument in ptr. See \ref InsertTaskUtility for more details.
starpu_combined_worker_assign_workerid
Register a new combined worker and get its identifier. See \ref SchedulingHelpers for more details.
starpu_combined_worker_can_execute_task
Variant of starpu_worker_can_execute_task() compatible with combined workers. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_combined_worker_get_count
Return the number of different combined workers. See \ref SchedulingHelpers for more details.
starpu_combined_worker_get_description
Get the description of a combined worker. See \ref SchedulingHelpers for more details.
starpu_combined_worker_get_id
Return the identifier of the current combined worker. See \ref SchedulingHelpers for more details.
starpu_combined_worker_get_rank
Return the rank of the current thread within the combined worker. Can only be used in ::STARPU_SPMD parallel tasks, to know which part of the task to work on. See \ref SPMD-modeParallelTasks for more details.
starpu_combined_worker_get_size
Return the size of the current combined worker, i.e. the total number of CPUs running the same task in the case of ::STARPU_SPMD parallel tasks, or the total number of threads that the task is allowed to start in the case of ::STARPU_FORKJOIN parallel tasks. See \ref Fork-modeParallelTasks and \ref SPMD-modeParallelTasks for more details.
starpu_conf_init
Initialize the \p conf structure with the default values. In case some configuration parameters are already specified through environment variables, starpu_conf_init() initializes the fields of \p conf according to the environment variables. For instance if \ref STARPU_CALIBRATE is set, its value is put in the field starpu_conf::calibrate of \p conf. Upon successful completion, this function returns 0. Otherwise, -EINVAL indicates that the argument was NULL.
starpu_conf_noworker
Set fields of \p conf so that no worker is enabled, i.e. set starpu_conf::ncpus = 0, starpu_conf::ncuda = 0, etc.
starpu_coo_data_register
Register the \p nx x \p ny 2D matrix given in the COO format, using the \p columns, \p rows, \p values arrays, which must have \p n_values elements of size \p elemsize. Initialize \p handleptr. See \ref COODataInterface for more details.
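As an illustration of the COO layout, the sketch below expands three parallel arrays into a dense matrix. It does not call StarPU. The helper `coo_to_dense` is hypothetical, and the sketch assumes that `nx` counts columns, that `ny` counts rows, that indices are 0-based, and that duplicate coordinates are summed, none of which is stated by the description above.

```rust
/// Expand a COO matrix into a dense row-major matrix of `ny` rows and
/// `nx` columns. Entry k has coordinates (rows[k], columns[k]) and
/// value values[k]. Assumption: duplicate coordinates accumulate.
fn coo_to_dense(
    nx: usize,
    ny: usize,
    columns: &[u32],
    rows: &[u32],
    values: &[f64],
) -> Vec<Vec<f64>> {
    assert_eq!(columns.len(), values.len(), "arrays must share n_values");
    assert_eq!(rows.len(), values.len(), "arrays must share n_values");
    let mut dense = vec![vec![0.0; nx]; ny];
    for k in 0..values.len() {
        let (r, c) = (rows[k] as usize, columns[k] as usize);
        assert!(r < ny && c < nx, "coordinate out of bounds");
        dense[r][c] += values[k];
    }
    dense
}
```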
starpu_cpu_os_index
Return the OS number of a given \p cpuid
starpu_cpu_worker_get_count
Return the number of CPUs controlled by StarPU. The return value should be at most \ref STARPU_MAXCPUS. See \ref TopologyWorkers for more details.
starpu_create_callback_task
Create and submit an empty task with the given callback. See \ref SynchronizationTasks for more details.
starpu_create_sync_task
Create and submit an empty task that unlocks a tag once all its dependencies are fulfilled. See \ref SynchronizationTasks for more details.
starpu_csr_data_register
Register a CSR (Compressed Sparse Row Representation) sparse matrix. See \ref CSRDataInterface for more details.
starpu_csr_filter_vertical_block
Partition a block-sparse matrix into vertical block-sparse matrices.
starpu_csr_get_elemsize
Return the size of the elements registered into the matrix designated by \p handle.
starpu_csr_get_firstentry
Return the index at which all arrays (the column indexes, the row pointers…) of the matrix designated by \p handle start.
starpu_csr_get_local_colind
Return a local pointer to the column index of the matrix designated by \p handle.
starpu_csr_get_local_nzval
Return a local pointer to the non-zero values of the matrix designated by \p handle.
starpu_csr_get_local_rowptr
Return a local pointer to the row pointer array of the matrix designated by \p handle.
starpu_csr_get_nnz
Return the number of non-zero values in the matrix designated by \p handle.
starpu_csr_get_nrow
Return the size of the row pointer array of the matrix designated by \p handle.
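How the CSR accessors above fit together can be sketched with a sparse matrix-vector product over plain slices. This is a hedged illustration, not code using the bindings. The function `csr_spmv` is hypothetical, and it assumes the conventional CSR layout in which the row pointer array has one more entry than there are rows, and in which the first-entry offset is subtracted from the row pointers only.

```rust
/// Compute y = A * x for a CSR matrix given as plain slices.
/// Assumptions: `rowptr` has `nrow + 1` entries, and `firstentry`
/// (0 for C-style, 1 for Fortran-style indexing) offsets `rowptr` only.
fn csr_spmv(
    nrow: usize,
    rowptr: &[u32],
    colind: &[u32],
    nzval: &[f64],
    firstentry: u32,
    x: &[f64],
) -> Vec<f64> {
    assert_eq!(rowptr.len(), nrow + 1, "rowptr needs nrow + 1 entries");
    assert_eq!(colind.len(), nzval.len(), "one column index per value");
    let mut y = vec![0.0; nrow];
    for row in 0..nrow {
        let start = (rowptr[row] - firstentry) as usize;
        let end = (rowptr[row + 1] - firstentry) as usize;
        for k in start..end {
            y[row] += nzval[k] * x[colind[k] as usize];
        }
    }
    y
}
```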
starpu_cublas_init
Initialize CUBLAS on every CUDA device. The CUBLAS library must be initialized prior to any CUBLAS call. Calling starpu_cublas_init() will initialize CUBLAS on every CUDA device controlled by StarPU. This call blocks until CUBLAS has been properly initialized on every device. See \ref CUDA-specificOptimizations for more details.
starpu_cublas_set_stream
Set the proper CUBLAS stream for CUBLAS v1. This must be called from the CUDA codelet before calling CUBLAS v1 kernels, so that they are queued on the proper CUDA stream. When using one thread per CUDA worker, this function does not do anything since the CUBLAS stream does not change, and is set once by starpu_cublas_init(). See \ref CUDA-specificOptimizations for more details.
starpu_cublas_shutdown
Synchronously deinitialize the CUBLAS library on every CUDA device. See \ref CUDA-specificOptimizations for more details.
starpu_cuda_worker_get_count
Return the number of CUDA devices controlled by StarPU. The return value should be at most \ref STARPU_MAXCUDADEVS. See \ref TopologyWorkers for more details.
starpu_cusparse_init
Initialize CUSPARSE on every CUDA device controlled by StarPU. This call blocks until CUSPARSE has been properly initialized on every device. See \ref CUDA-specificOptimizations for more details.
starpu_cusparse_shutdown
Synchronously deinitialize the CUSPARSE library on every CUDA device. See \ref CUDA-specificOptimizations for more details.
starpu_data_acquire
The application must call this function prior to accessing registered data from main memory outside tasks. StarPU ensures that the application will get an up-to-date copy of \p handle in main memory located where the data was originally registered, and that all concurrent accesses (e.g. from tasks) will be consistent with the access mode specified with \p mode. starpu_data_release() must be called once the application no longer needs to access the piece of data. Note that implicit data dependencies are also enforced by starpu_data_acquire(), i.e. starpu_data_acquire() will wait for all tasks scheduled to work on the data, unless they have been disabled explicitly by calling starpu_data_set_default_sequential_consistency_flag() or starpu_data_set_sequential_consistency_flag(). starpu_data_acquire() is a blocking call, so that it cannot be called from tasks or from their callbacks (in that case, starpu_data_acquire() returns -EDEADLK). Upon successful completion, this function returns 0. See \ref DataAccess for more details.
starpu_data_acquire_cb
Asynchronous equivalent of starpu_data_acquire(). When the data specified in \p handle is available in the access \p mode, the \p callback function is executed. The application may access the requested data during the execution of \p callback. The \p callback function must call starpu_data_release() once the application no longer needs to access the piece of data. Note that implicit data dependencies are also enforced by starpu_data_acquire_cb() in case they are not disabled. Contrary to starpu_data_acquire(), this function is non-blocking and may be called from task callbacks. Upon successful completion, this function returns 0. See \ref DataAccess for more details.
starpu_data_acquire_cb_sequential_consistency
Similar to starpu_data_acquire_cb() with the possibility of enabling or disabling data dependencies. When the data specified in \p handle is available in the access \p mode, the \p callback function is executed. The application may access the requested data during the execution of this \p callback. The \p callback function must call starpu_data_release() once the application no longer needs to access the piece of data. Note that implicit data dependencies are also enforced by starpu_data_acquire_cb_sequential_consistency() in case they are not disabled specifically for the given \p handle or by the parameter \p sequential_consistency. Similarly to starpu_data_acquire_cb(), this function is non-blocking and may be called from task callbacks. Upon successful completion, this function returns 0. See \ref DataAccess for more details.
starpu_data_acquire_on_node
Similar to starpu_data_acquire(), except that the data will be available on the given memory node instead of main memory. ::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be used instead of an explicit node number. See \ref DataAccess for more details.
starpu_data_acquire_on_node_cb
Similar to starpu_data_acquire_cb(), except that the data will be available on the given memory node instead of main memory. ::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be used instead of an explicit node number. See \ref DataAccess for more details.
starpu_data_acquire_on_node_cb_sequential_consistency
Similar to starpu_data_acquire_cb_sequential_consistency(), except that the data will be available on the given memory node instead of main memory. ::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be used instead of an explicit node number. See \ref DataAccess for more details.
starpu_data_acquire_on_node_cb_sequential_consistency_sync_jobids
Similar to starpu_data_acquire_on_node_cb_sequential_consistency(), except that the \e pre_sync_jobid and \e post_sync_jobid parameters can be used to retrieve the jobid of the synchronization tasks. \e pre_sync_jobid happens just before the acquisition, and \e post_sync_jobid happens just after the release.
starpu_data_acquire_on_node_try
Similar to starpu_data_acquire_try(), except that the data will be available on the given memory node instead of main memory. ::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be used instead of an explicit node number. See \ref DataAccess for more details.
starpu_data_acquire_try
The application can call this function instead of starpu_data_acquire() so as to acquire the data like starpu_data_acquire(), but only if all previously-submitted tasks have completed, in which case starpu_data_acquire_try() returns 0. StarPU will have ensured that the application will get an up-to-date copy of \p handle in main memory located where the data was originally registered. starpu_data_release() must be called once the application no longer needs to access the piece of data. See \ref DataAccess for more details.
starpu_data_advise_as_important
Specify that the data \p handle can be discarded without impacting the application.
starpu_data_assign_arbiter
Make access to \p handle managed by \p arbiter, see \ref ConcurrentDataAccess for the details.
starpu_data_can_evict
Check whether data \p handle can be evicted now from node \p node. See \ref DataPrefetch for more details.
starpu_data_cpy
Copy the content of \p src_handle into \p dst_handle. The parameter \p asynchronous indicates whether the function should block or not. In the case of an asynchronous call, it is possible to synchronize with the termination of this operation either by the means of implicit dependencies (if enabled) or by calling starpu_task_wait_for_all(). If \p callback_func is not NULL, this callback function is executed after the handle has been copied, and it is given the pointer \p callback_arg as argument. See \ref DataHandlesHelpers for more details.
starpu_data_cpy_priority
Like starpu_data_cpy(), copy the content of \p src_handle into \p dst_handle, but additionally take a \p priority parameter to sort it among the whole task graph. See \ref DataHandlesHelpers for more details.
starpu_data_display_memory_stats
Display statistics about the current data handles registered within StarPU. StarPU must have been configured with the configure option \ref enable-memory-stats "--enable-memory-stats". See \ref MemoryFeedback for more details.
starpu_data_dup_ro
Create a copy of \p src_handle, and return a new handle in \p dst_handle, which is to be used only for read accesses. This allows StarPU to optimize it by not actually copying the data whenever possible (e.g. it may possibly simply return src_handle itself). The parameter \p asynchronous indicates whether the function should block or not. In the case of an asynchronous call, it is possible to synchronize with the termination of this operation either by the means of implicit dependencies (if enabled) or by calling starpu_task_wait_for_all(). If \p callback_func is not NULL, this callback function is executed after the handle has been copied, and it is given the pointer \p callback_arg as argument. See \ref DataHandlesHelpers for more details.
starpu_data_evict_from_node
Advise StarPU to evict \p handle from the memory node \p node. StarPU will thus write its value back to its home node before evicting it. This may however fail if e.g. some task is still working on it.
starpu_data_expected_transfer_time
Predict the transfer time (in micro-seconds) to move \p handle to a memory node. See \ref SchedulingHelpers for more details.
starpu_data_fetch_on_node
Issue a fetch request for the data \p handle to \p node, i.e. requests that the data be replicated to the given node as soon as possible, so that it is available there for tasks. If \p async is 0, the call will block until the transfer is achieved, else the call will return immediately, after having just queued the request. In the latter case, the request will asynchronously wait for the completion of any task writing on the data. See \ref DataPrefetch for more details.
starpu_data_get_alloc_size
Return the size of the allocated data associated with \p handle. See \ref DataHandlesHelpers for more details.
starpu_data_get_child
Return the \p i -th child of the given \p handle, which must have been partitioned beforehand. See \ref PartitioningData for more details.
starpu_data_get_coordinates_array
Get the coordinates of the data, as set by a previous call to starpu_data_set_coordinates_array() or starpu_data_set_coordinates(). \p dimensions is the size of the \p dims array. This returns the actual number of returned coordinates. See \ref CreatingAGanttDiagram for more details.
starpu_data_get_default_sequential_consistency_flag
Return the default sequential consistency flag. See \ref SequentialConsistency for more details.
starpu_data_get_home_node
See \ref DataHandlesHelpers for more details.
starpu_data_get_interface_id
Return the unique identifier of the interface associated with the given \p handle. See \ref DefiningANewDataInterface_helpers for more details.
starpu_data_get_interface_on_node
Return the interface associated with \p handle on \p memory_node. See \ref DefiningANewDataInterface_pack for more details.
starpu_data_get_interface_ops
starpu_data_get_local_ptr
Return the local pointer associated with \p handle or NULL if \p handle’s interface does not have any data allocated locally. See \ref DataPointers for more details.
starpu_data_get_max_size
Return the maximum size that the \p handle data may need to increase to. See \ref DataHandlesHelpers for more details.
starpu_data_get_nb_children
Return the number of children \p handle has been partitioned into. See \ref PartitioningData for more details.
starpu_data_get_ooc_flag
Get whether this data was set to be eligible to be evicted to disk storage (1) or not (0). See \ref OOCDataRegistration for more details.
starpu_data_get_sched_data
Retrieve the field \c sched_data previously set for the \p handle. See \ref DataHandlesHelpers for more details.
starpu_data_get_sequential_consistency_flag
Get the data consistency mode associated to the data handle \p handle. See \ref SequentialConsistency for more details.
starpu_data_get_size
Return the size of the data associated with \p handle. See \ref DataHandlesHelpers for more details.
starpu_data_get_sub_data
After partitioning a StarPU data by applying a filter, starpu_data_get_sub_data() can be used to get handles for each of the data portions. \p root_data is the parent data that was partitioned. \p depth is the number of filters to traverse (in case several filters have been applied, to e.g. partition in row blocks, and then in column blocks), and the subsequent parameters are the indexes. The function returns a handle to the subdata.
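The depth-plus-indexes lookup described above amounts to walking a tree of partitions, one index per applied filter. The sketch below models that walk on a toy tree. The `Node` type and the `partition` and `get_sub_data` functions are hypothetical stand-ins for illustration and are unrelated to the real handle type.

```rust
/// Toy stand-in for a data handle: a name plus its partitioned children.
struct Node {
    name: String,
    children: Vec<Node>,
}

/// Split `node` into `nparts` children named after their index.
fn partition(node: &mut Node, nparts: usize) {
    let children: Vec<Node> = (0..nparts)
        .map(|i| Node {
            name: format!("{}[{}]", node.name, i),
            children: Vec::new(),
        })
        .collect();
    node.children = children;
}

/// Follow one index per partitioning level; `indexes.len()` plays the
/// role of the depth. Returns None if an index is out of range.
fn get_sub_data<'a>(root: &'a Node, indexes: &[usize]) -> Option<&'a Node> {
    let mut current = root;
    for &i in indexes {
        current = current.children.get(i)?;
    }
    Some(current)
}
```

With a root split into row blocks and each row block split into column blocks, a depth of 2 with indexes (1, 2) selects the third column block of the second row block.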
starpu_data_get_user_data
Retrieve the field \c user_data previously set for the \p handle. See \ref DataHandlesHelpers for more details.
starpu_data_handle_to_pointer
Return the pointer associated with \p handle on node \p node or NULL if handle’s interface does not support this operation or data for this \p handle is not allocated on that \p node. See \ref DataPointers for more details.
starpu_data_idle_prefetch_on_node
Issue an idle prefetch request for the data \p handle to \p node, i.e. requests that the data be replicated to \p node, so that it is available there for tasks, but only when the bus is really idle. If \p async is 0, the call will block until the transfer is achieved, else the call will return immediately, after having just queued the request. In the latter case, the request will asynchronously wait for the completion of any task writing on the data. See \ref DataPrefetch for more details.
starpu_data_idle_prefetch_on_node_prio
See \ref DataPrefetch for more details.
starpu_data_interface_get_next_id
Return the next available id for a newly created data interface (\ref DefiningANewDataInterface).
starpu_data_invalidate
Destroy all replicates of the data \p handle immediately. After data invalidation, the first access to \p handle must be performed in ::STARPU_W mode. Accessing an invalidated data in ::STARPU_R mode results in undefined behaviour. See \ref DataManagementAllocation for more details.
starpu_data_invalidate_submit
Submit invalidation of the data \p handle after completion of previously submitted tasks. See \ref DataReduction for more details.
starpu_data_is_on_node
Check whether a valid copy of \p handle is currently available on memory node \p node (or a transfer request for getting so is ongoing). See \ref SchedulingHelpers for more details.
starpu_data_map_filters
Apply \p nfilters filters to the handle designated by \p root_handle recursively. \p nfilters pointers to variables of the type starpu_data_filter should be given. See \ref PartitioningData for more details.
starpu_data_map_filters_array
Apply \p nfilters filters to the handle designated by \p root_handle recursively. The list of filter \p filters of the type starpu_data_filter should be given. See \ref PartitioningData for more details.
starpu_data_map_filters_parray
Apply \p nfilters filters to the handle designated by \p root_handle recursively. The pointer of the filter list \p filters of the type starpu_data_filter should be given. See \ref PartitioningData for more details.
starpu_data_pack
Like starpu_data_pack_node(), but for the local memory node. See \ref DataHandlesHelpers for more details.
starpu_data_pack_node
Execute the packing operation of the interface of the data registered at \p handle (see starpu_data_interface_ops). This packing operation must allocate a buffer large enough at \p ptr on node \p node and copy into the newly allocated buffer the data associated to \p handle. \p count will be set to the size of the allocated buffer. If \p ptr is NULL, the function should not copy the data in the buffer but just set \p count to the size of the buffer which would have been allocated. The special value -1 indicates the size is yet unknown. See \ref DataHandlesHelpers for more details.
starpu_data_partition
Request the partitioning of \p initial_handle into several subdata according to the filter \p f.
starpu_data_partition_clean
Clear the partition planning established between \p root_data and \p children with starpu_data_partition_plan(). This will notably submit an unregistration of all the \p children, which can thus no longer be used afterwards. See \ref AsynchronousPartitioning for more details.
starpu_data_partition_clean_node
Similar to starpu_data_partition_clean() but the root data will be gathered on the given node. See \ref AsynchronousPartitioning for more details.
starpu_data_partition_plan
Plan to partition \p initial_handle into several subdata according to the filter \p f. The handles are returned into the \p children array, which has to be the same size as the number of parts described in \p f. These handles are not immediately usable, starpu_data_partition_submit() has to be called to submit the actual partitioning.
starpu_data_partition_readonly_downgrade_submit
Assuming that a partitioning of \p initial_handle has already been submitted in read-write mode through starpu_data_partition_submit(), downgrade that partitioning into read-only mode for the \p children, fetching data back to the \p initial_handle, and adding the necessary dependencies. See \ref AsynchronousPartitioning for more details.
starpu_data_partition_readonly_submit
Similar to starpu_data_partition_submit(), but do not invalidate \p initial_handle. This allows the application to continue using it, but the application has to be careful not to write to \p initial_handle or \p children handles, only read from them, since coherency is otherwise not guaranteed. This makes it possible to submit various tasks which concurrently read from various partitions of the data.
starpu_data_partition_readonly_submit_sequential_consistency
Similar to starpu_data_partition_readonly_submit(), but allow specifying the coherency to be used for the main data \p initial_handle. See \ref AsynchronousPartitioning for more details.
starpu_data_partition_readwrite_upgrade_submit
Assuming that a partitioning of \p initial_handle has already been submitted in read-only mode through starpu_data_partition_readonly_submit(), upgrade that partitioning into read-write mode for the \p children, by invalidating \p initial_handle, and adding the necessary dependencies. See \ref AsynchronousPartitioning for more details.
starpu_data_partition_submit
Submit the actual partitioning of \p initial_handle into the \p nparts \p children handles. This call is asynchronous, it only submits that the partitioning should be done, so that the \p children handles can now be used to submit tasks, and \p initial_handle can not be used to submit tasks any more (to guarantee coherency). For instance, \code{.c} starpu_data_partition_submit(A_handle, nslicesx, children); \endcode
starpu_data_partition_submit_sequential_consistency
Similar to starpu_data_partition_submit(), but also allow specifying the coherency to be used for the main data \p initial_handle through the parameter \p sequential_consistency. See \ref AsynchronousPartitioning for more details.
starpu_data_peek
Read in handle’s local replicate the data located at \p ptr of size \p count as described by the interface of the data. The interface registered at \p handle must define a peeking operation (see starpu_data_interface_ops). See \ref DataHandlesHelpers for more details.
starpu_data_peek_node
Read in handle’s \p node replicate the data located at \p ptr of size \p count as described by the interface of the data. The interface registered at \p handle must define a peeking operation (see starpu_data_interface_ops). See \ref DataHandlesHelpers for more details.
starpu_data_prefetch_on_node
Issue a prefetch request for the data \p handle to \p node, i.e. requests that the data be replicated to \p node when there is room for it, so that it is available there for tasks. If \p async is 0, the call will block until the transfer is achieved, else the call will return immediately, after having just queued the request. In the latter case, the request will asynchronously wait for the completion of any task writing on the data. See \ref DataPrefetch for more details.
starpu_data_prefetch_on_node_prio
See \ref DataPrefetch for more details.
starpu_data_print
Print basic information on \p handle on \p node. See \ref DataHandlesHelpers for more details.
starpu_data_ptr_register
Register that a buffer for \p handle on \p node will be set. This is typically used by starpu_*_ptr_register helpers before setting the interface pointers for this node, to tell the core that the buffer is now allocated. See \ref DefiningANewDataInterface_pointers for more details.
starpu_data_query_status
Same as starpu_data_query_status2(), but without the is_loading parameter. See \ref DataPrefetch for more details.
starpu_data_query_status2
Query the status of \p handle on the specified \p memory_node.
starpu_data_register
Register a piece of data into the handle located at the \p handleptr address. The \p data_interface buffer contains the initial description of the data in the \p home_node. The \p ops argument is a pointer to a structure describing the different methods used to manipulate this type of interface. See starpu_data_interface_ops for more details on this structure. If \p home_node is -1, StarPU will automatically allocate the memory when it is used for the first time in write-only mode. Once such data handle has been automatically allocated, it is possible to access it using any access mode. Note that StarPU supplies a set of predefined types of interface (e.g. vector or matrix) which can be registered by the means of helper functions (e.g. starpu_vector_data_register() or starpu_matrix_data_register()).
starpu_data_register_ops
Register the given data interface operations. If the field starpu_data_interface_ops::field is set to ::STARPU_UNKNOWN_INTERFACE_ID, then a new identifier will be set by calling starpu_data_interface_get_next_id(). The function is automatically called when registering a piece of data with starpu_data_register(). It is only necessary to call it beforehand for some specific cases (such as the master-slave mode).
starpu_data_register_same
Register a new piece of data into the handle \p handledst with the same interface as the handle \p handlesrc. See \ref DataHandlesHelpers for more details.
starpu_data_release
Release the piece of data acquired by the application either by starpu_data_acquire() or by starpu_data_acquire_cb(). See \ref DataAccess for more details.
starpu_data_release_on_node
Similar to starpu_data_release(), except that the data was made available on the given memory \p node instead of main memory. The \p node parameter must be exactly the same as the corresponding \c starpu_data_acquire_on_node* call. See \ref DataAccess for more details.
starpu_data_release_to
Partly release the piece of data acquired by the application either by starpu_data_acquire() or by starpu_data_acquire_cb(), switching the acquisition down to \p down_to_mode. For now, only releasing from ::STARPU_RW or ::STARPU_W acquisition down to ::STARPU_R is supported, or down to the same acquisition. ::STARPU_NONE can also be passed as \p down_to_mode, in which case this is equivalent to calling starpu_data_release(). See \ref DataAccess for more details.
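The downgrade rules stated above can be written out as a small decision function. This is a model of the documented rules only. The `Mode` and `ReleaseTo` enums and the `release_to` function are hypothetical names for this sketch and do not correspond to the constants or signatures exposed by the bindings.

```rust
/// Illustrative access modes, standing in for the STARPU_* constants.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Mode {
    None,
    R,
    W,
    RW,
}

/// Outcome of asking to release an acquisition down to another mode.
#[derive(PartialEq, Eq, Debug)]
enum ReleaseTo {
    /// Equivalent to a plain release.
    FullRelease,
    /// The acquisition is kept in the given mode.
    Downgrade(Mode),
    /// Not supported according to the description above.
    Unsupported,
}

/// Classify a (currently acquired mode, requested mode) pair.
fn release_to(acquired: Mode, down_to: Mode) -> ReleaseTo {
    match (acquired, down_to) {
        // Passing "none" means releasing entirely.
        (_, Mode::None) => ReleaseTo::FullRelease,
        // Staying in the same mode is allowed.
        (a, d) if a == d => ReleaseTo::Downgrade(d),
        // Write or read-write can be lowered to read-only.
        (Mode::RW, Mode::R) | (Mode::W, Mode::R) => ReleaseTo::Downgrade(Mode::R),
        // Everything else (e.g. R up to RW) is not covered.
        _ => ReleaseTo::Unsupported,
    }
}
```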
starpu_data_release_to_on_node
Similar to starpu_data_release_to(), except that the data was made available on the given memory \p node instead of main memory. The \p node parameter must be exactly the same as the corresponding \c starpu_data_acquire_on_node* call. See \ref DataAccess for more details.
starpu_data_request_allocation
Explicitly ask StarPU to allocate room for a piece of data on the specified memory \p node. See \ref DataPrefetch for more details.
starpu_data_set_coordinates
Set the coordinates of the data, to be shown in various profiling tools. \p dimensions is the number of subsequent \c int parameters. This can be for instance the tile coordinates within a big matrix. See \ref CreatingAGanttDiagram for more details.
starpu_data_set_coordinates_array
Set the coordinates of the data, to be shown in various profiling tools. \p dimensions is the size of the \p dims array. This can be for instance the tile coordinates within a big matrix. See \ref CreatingAGanttDiagram for more details.
starpu_data_set_default_sequential_consistency_flag
Set the default sequential consistency flag. If a non-zero value is passed, a sequential data consistency will be enforced for all handles registered after this function call, otherwise it is disabled. By default, StarPU enables sequential data consistency. It is also possible to select the data consistency mode of a specific data handle with the function starpu_data_set_sequential_consistency_flag(). See \ref SequentialConsistency for more details.
starpu_data_set_name
Set the name of the data, to be shown in various profiling tools. See \ref CreatingAGanttDiagram for more details.
starpu_data_set_ooc_flag
Set whether this data should be eligible to be evicted to disk storage (1) or not (0). The default is 1. See \ref OOCDataRegistration for more details.
starpu_data_set_reduction_methods
Set the codelets to be used for \p handle when it is accessed in the mode ::STARPU_REDUX. Per-worker buffers will be initialized with the codelet \p init_cl (which has to take one handle with ::STARPU_W), and reduction between per-worker buffers will be done with the codelet \p redux_cl (which has to take a first accumulation handle with ::STARPU_RW|::STARPU_COMMUTE, and a second contribution handle with ::STARPU_R). See \ref DataReduction and \ref TemporaryData for more details.
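To make the two roles concrete, here is a minimal model of a reduction in plain Rust, with no StarPU calls. It assumes a sum as the reduction operator; the `init` and `redux` closures are hypothetical stand-ins for what \p init_cl and \p redux_cl would do on real per-worker buffers.

```rust
/// Model of a STARPU_REDUX-style reduction over per-worker contributions.
/// Each inner Vec holds the values one worker contributes.
pub fn reduce_per_worker(contributions: &[Vec<i64>]) -> i64 {
    // Stand-in for init_cl: write the neutral element into a fresh buffer.
    let init = || 0i64;
    // Stand-in for redux_cl: fold a contribution into an accumulator.
    let redux = |acc: &mut i64, contribution: &i64| *acc += *contribution;

    // Every worker starts from an initialised private buffer and
    // accumulates its own contributions locally.
    let buffers: Vec<i64> = contributions
        .iter()
        .map(|work| {
            let mut local = init();
            for value in work {
                local += *value;
            }
            local
        })
        .collect();

    // The per-worker buffers are then combined with the reduction operator.
    let mut result = init();
    for buffer in &buffers {
        redux(&mut result, buffer);
    }
    result
}
```

The result is only independent of how work is spread over workers if the operator is associative and commutative and `init` yields its neutral element.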
starpu_data_set_reduction_methods_with_args
Same as starpu_data_set_reduction_methods(), but allows passing arguments to the reduction and init tasks.
starpu_data_set_sched_data
Set the field \c sched_data for the \p handle to \p sched_data . It can then be retrieved with starpu_data_get_sched_data(). \p sched_data can be any scheduler-defined value. See \ref DataHandlesHelpers for more details.
starpu_data_set_sequential_consistency_flag
Set the data consistency mode associated to a data handle. The consistency mode set using this function takes priority over the default mode which can be set with starpu_data_set_default_sequential_consistency_flag(). See \ref SequentialConsistency and \ref DataManagementAllocation for more details.
starpu_data_set_user_data
Set the field \c user_data for the \p handle to \p user_data . It can then be retrieved with starpu_data_get_user_data(). \p user_data can be any application-defined value, for instance a pointer to an object-oriented container for the data. See \ref DataHandlesHelpers for more details.
starpu_data_set_wt_mask
Set the write-through mask of the data \p handle (and its children), i.e. a bitmask of nodes where the data should be always replicated after modification. It also prevents the data from being evicted from these nodes when memory gets scarce. When the data is modified, it is automatically transferred into those memory nodes. For instance a 1<<0 write-through mask means that the CUDA workers will commit their changes in main memory (node 0). See \ref DataManagementAllocation for more details.
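Since the write-through mask is a plain bitmask with one bit per memory node, it can be built from a list of node indices. The helper below is a hypothetical convenience, not part of the bindings, and it assumes a 32-bit mask.

```rust
/// Build a write-through bitmask with one bit set per memory node index.
/// Panics if a node index does not fit in a 32-bit mask.
pub fn wt_mask(nodes: &[u32]) -> u32 {
    nodes.iter().fold(0u32, |mask, &node| {
        assert!(node < 32, "node index {} does not fit in a 32-bit mask", node);
        mask | (1u32 << node)
    })
}
```

With this helper, `wt_mask(&[0])` gives the `1<<0` mask from the example above, i.e. write-through to main memory only.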
starpu_data_test_if_allocated_on_node
See \ref DataPrefetch for more details.
starpu_data_test_if_mapped_on_node
See \ref DataPrefetch for more details.
starpu_data_unpack
Unpack in handle the data located at \p ptr of size \p count as described by the interface of the data. The interface registered at \p handle must define an unpacking operation (see starpu_data_interface_ops). See \ref DataHandlesHelpers for more details.
starpu_data_unpack_node
Unpack in handle the data located at \p ptr of size \p count allocated on node \p node as described by the interface of the data. The interface registered at \p handle must define an unpacking operation (see starpu_data_interface_ops). See \ref DataHandlesHelpers for more details.
starpu_data_unpartition
Unapply the filter which has been applied to \p root_data, thus unpartitioning the data. The pieces of data are collected back into one big piece in the \p gathering_node (usually ::STARPU_MAIN_RAM). Tasks working on the partitioned data will be waited for by starpu_data_unpartition().
starpu_data_unpartition_readonly_submit
Similar to starpu_data_partition_submit(), but do not invalidate \p initial_handle. This allows to continue using it, but the application has to be careful not to write to \p initial_handle or \p children handles, only read from them, since the coherency is otherwise not guaranteed. This thus allows to submit various tasks which concurrently read from various partitions of the data. See \ref AsynchronousPartitioning for more details.
starpu_data_unpartition_submit
Assuming that \p initial_handle is partitioned into \p children, submit an unpartitioning of \p initial_handle, i.e. submit a gathering of the pieces on the requested \p gathering_node memory node, and submit an invalidation of the children. See \ref AsynchronousPartitioning for more details.
starpu_data_unpartition_submit_sequential_consistency
Similar to starpu_data_unpartition_submit(), but also allows specifying the coherency to be used for the main data \p initial_handle through the parameter \p sequential_consistency. See \ref AsynchronousPartitioning for more details.
starpu_data_unpartition_submit_sequential_consistency_cb
Similar to starpu_data_unpartition_submit_sequential_consistency(), but allows specifying a callback function for the unpartitioning task. See \ref AsynchronousPartitioning for more details.
starpu_data_unregister
Unregister a data \p handle from StarPU. If the data was automatically allocated by StarPU because the home node was -1, all automatically allocated buffers are freed. Otherwise, a valid copy of the data is put back into the home node in the buffer that was initially registered. Using a data handle that has been unregistered from StarPU results in undefined behaviour. If the value of the data in the home node does not need to be updated, the function starpu_data_unregister_no_coherency() can be used instead. See \ref TaskSubmission for more details.
starpu_data_unregister_no_coherency
Similar to starpu_data_unregister(), except that StarPU does not put back a valid copy into the home node, in the buffer that was initially registered. See \ref DataManagementAllocation for more details.
starpu_data_unregister_submit
Destroy the data \p handle once it is no longer needed by any submitted task. No coherency is provided.
starpu_data_vget_sub_data
Similar to starpu_data_get_sub_data() but use a \c va_list for the parameter list. See \ref PartitioningData for more details.
starpu_data_vmap_filters
Apply \p nfilters filters to the handle designated by \p root_handle recursively. Use a \p va_list of pointers to variables of the type starpu_data_filter. See \ref PartitioningData for more details.
starpu_data_wont_use
Advise StarPU that \p handle will not be used in the near future, and is thus a good candidate for eviction from GPUs. StarPU will thus write its value back to its home node when the bus is idle, and select this data in priority for eviction when memory gets low. See \ref DataPrefetch for more details.
starpu_disk_close
Close an existing data opened with starpu_disk_open(). See \ref OutOfCore_Introduction for more details.
starpu_disk_open
Open an existing file memory in a disk node. \p size is the size of the file. \p pos is the specific position dependent on the backend, given to the \c open method of the disk operations. Return an opaque object pointer. See \ref OutOfCore_Introduction for more details.
starpu_disk_register
Register a disk memory node with a set of functions to manipulate data. The \c plug member of \p func will be passed \p parameter, and return a \c base which will be passed to all \p func methods.
On success, return the disk node.
On failure, return an error code.
\p size must be at least \ref STARPU_DISK_SIZE_MIN bytes. A negative \p size means infinite size.
starpu_display_bindings
Call hwloc-ps to display binding of each process and thread running on the machine.
Use the environment variable \ref STARPU_DISPLAY_BINDINGS to automatically call this function at the beginning of the execution of StarPU. See \ref MiscellaneousAndDebug for more details.
starpu_display_stats
Call starpu_profiling_bus_helper_display_summary() and starpu_profiling_worker_helper_display_summary(). See \ref DataStatistics for more details.
starpu_do_schedule
See \ref GraphScheduling for more details.
starpu_driver_deinit
Deinitialize the given driver. Return 0 on success, -EINVAL if starpu_driver::type is not a valid ::starpu_worker_archtype. See \ref UsingTheDriverAPI for more details.
starpu_driver_init
Initialize the given driver. Return 0 on success, -EINVAL if starpu_driver::type is not a valid ::starpu_worker_archtype. See \ref UsingTheDriverAPI for more details.
starpu_driver_run
Initialize the given driver, run it until it receives a request to terminate, deinitialize it and return 0 on success. Return -EINVAL if starpu_driver::type is not a valid StarPU device type (::STARPU_CPU_WORKER, ::STARPU_CUDA_WORKER or ::STARPU_OPENCL_WORKER).
starpu_driver_run_once
Run the driver once, then return 0 on success, -EINVAL if starpu_driver::type is not a valid ::starpu_worker_archtype. See \ref UsingTheDriverAPI for more details.
starpu_drivers_preinit
Pre-initialize drivers so as to register information on device types, memory types, etc. Only used internally by StarPU.
starpu_drivers_request_termination
Notify all running drivers that they should terminate. See \ref UsingTheDriverAPI for more details.
starpu_energy_start
Start counting hardware events in an event set.
starpu_energy_stop
Stop counting hardware events in an event set.
starpu_energy_use
Account for \p joules J being used. This is supported in simgrid mode, to record how much energy was used, and will show up in further calls to starpu_energy_used(). See \ref Energy-basedScheduling for more details.
starpu_energy_used
Return the amount of energy having been used in J. This accounts for the amounts passed to starpu_energy_use(), but also for the static energy use set by the \ref STARPU_IDLE_POWER environment variable. See \ref Energy-basedScheduling for more details.
starpu_execute_on_each_worker
Execute the given function \p func on a subset of workers. When calling this method, the offloaded function \p func is executed by every StarPU worker that is eligible to execute the function. The argument \p arg is passed to the offloaded function. The argument \p where specifies on which types of processing units the function should be executed. Similarly to the field starpu_codelet::where, it is possible to specify that the function should be executed on every CUDA device and every CPU by passing ::STARPU_CPU|::STARPU_CUDA. This function blocks until \p func has been executed on every appropriate processing unit, and thus may not be called from a callback function for instance. See \ref HowToInitializeAComputationLibraryOnceForEachWorker for more details.
starpu_execute_on_each_worker_ex
Same as starpu_execute_on_each_worker(), except that the task name is specified in the argument \p name. See \ref HowToInitializeAComputationLibraryOnceForEachWorker for more details.
starpu_execute_on_specific_workers
Call \p func(\p arg) on every worker in the \p workers array. \p num_workers indicates the number of workers in this array. This function is synchronous, but the different workers may execute the function in parallel. See \ref HowToInitializeAComputationLibraryOnceForEachWorker for more details.
starpu_filter_nparts_compute_chunk_size_and_offset
Given an integer \p n, \p nparts the number of parts it must be divided in, and \p id the part currently considered, determine the \p chunk_size and the \p offset, taking into account the size of the elements stored in the data structure \p elemsize and \p blocksize, which is most often 1. See \ref DefiningANewDataFilter for more details.
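The arithmetic behind such a split can be sketched in plain Rust. How the C function distributes the remainder is not stated here; this sketch assumes the convention given for the matrix filters on this page, where the last part receives the remainder. The function name and the tuple return are hypothetical.

```rust
/// Split `n` elements into `nparts` parts and return, for part `id`,
/// the chunk size in elements and the byte offset of its first element.
/// Assumption: the last part absorbs the remainder.
pub fn chunk_size_and_offset(
    n: u32,
    nparts: u32,
    elemsize: usize,
    id: u32,
    blocksize: u32,
) -> (u32, usize) {
    assert!(nparts > 0 && id < nparts, "id must be in 0..nparts");
    let base = n / nparts;
    let remainder = n % nparts;
    let chunk_size = if id + 1 == nparts { base + remainder } else { base };
    // Each element index step covers `blocksize` elements of `elemsize` bytes.
    let offset = id as usize * base as usize * blocksize as usize * elemsize;
    (chunk_size, offset)
}
```

For example, 10 elements of 4 bytes split in 3 parts with a block size of 1 would give parts of 3, 3 and 4 elements under this assumption.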
starpu_free
@deprecated Free memory which has previously been allocated with starpu_malloc(). This function is deprecated, one should use starpu_free_noflag(). See \ref DataManagementAllocation for more details.
starpu_free_flags
Free memory by specifying its size. The given flags should be consistent with the ones given to starpu_malloc_flags() when allocating the memory. See \ref HowToLimitMemoryPerNode for more details.
starpu_free_noflag
Free memory by specifying its size. Should be used for memory allocated with starpu_malloc(). See \ref DataManagementAllocation for more details.
starpu_free_on_node
Free \p addr of \p size bytes on node \p dst_node which was previously allocated with starpu_malloc_on_node().
starpu_free_on_node_flags
Free \p addr of \p size bytes on node \p dst_node which was previously allocated with starpu_malloc_on_node_flags() with the given allocation \p flags.
starpu_fxt_autostart_profiling
Determine whether profiling should be started by starpu_init(), or only when starpu_fxt_start_profiling() is called. \p autostart should be 1 to do so, or 0 to prevent it. This function has to be called before starpu_init(). See \ref LimitingScopeTrace for more details.
starpu_fxt_generate_trace
starpu_fxt_is_enabled
Wrapper to get value of env variable STARPU_FXT_TRACE
starpu_fxt_options_init
starpu_fxt_options_shutdown
starpu_fxt_start_profiling
Start recording the trace. The trace is by default started from starpu_init() call, but can be paused by using starpu_fxt_stop_profiling(), in which case starpu_fxt_start_profiling() should be called to resume recording events. See \ref LimitingScopeTrace for more details.
starpu_fxt_stop_profiling
Stop recording the trace. The trace is by default stopped when calling starpu_shutdown(). starpu_fxt_stop_profiling() can however be used to stop it earlier. starpu_fxt_start_profiling() can then be called to start recording it again, etc. See \ref LimitingScopeTrace for more details.
starpu_fxt_trace_user_event
Add an event in the execution trace if FxT is enabled. See \ref CreatingAGanttDiagram for more details.
starpu_fxt_trace_user_event_string
Add a string event in the execution trace if FxT is enabled. See \ref CreatingAGanttDiagram for more details.
starpu_fxt_write_data_trace
starpu_fxt_write_data_trace_in_dir
starpu_get_env_size_default
If the environment variable \c str is defined with a well-defined size value, return the value as a size in bytes. Expected size qualifiers are b, B, k, K, m, M, g, G. The default qualifier is K. If the environment variable \c str is not defined or is empty, return \c defval. Raise an error if the value of the environment variable \c str is not well-defined. See \ref ExecutionConfigurationThroughEnvironmentVariables for more details.
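The size syntax described for starpu_get_env_size_default() can be modelled with a short parser. This is a sketch under two assumptions: the k, m and g qualifiers are powers of 1024, and the "raise an error" case is represented as `Err` rather than aborting. Both function names are hypothetical.

```rust
/// Parse a size such as "512b", "4k", "2M" or "1G" into bytes.
/// A bare number uses the default qualifier K.
/// Assumption: k, m and g are powers of 1024.
pub fn parse_size(value: &str) -> Option<u64> {
    let value = value.trim();
    let last = value.chars().last()?;
    let (digits, multiplier): (&str, u64) = match last {
        'b' | 'B' => (&value[..value.len() - 1], 1),
        'k' | 'K' => (&value[..value.len() - 1], 1 << 10),
        'm' | 'M' => (&value[..value.len() - 1], 1 << 20),
        'g' | 'G' => (&value[..value.len() - 1], 1 << 30),
        c if c.is_ascii_digit() => (value, 1 << 10),
        _ => return None,
    };
    digits.parse::<u64>().ok()?.checked_mul(multiplier)
}

/// Mirror the documented fallbacks: undefined or empty gives `defval`,
/// an ill-formed value is reported as an error.
pub fn size_or_default(raw: Option<&str>, defval: u64) -> Result<u64, String> {
    match raw {
        None => Ok(defval),
        Some(s) if s.trim().is_empty() => Ok(defval),
        Some(s) => parse_size(s).ok_or_else(|| format!("ill-formed size: {:?}", s)),
    }
}
```

In a real program the `raw` argument would typically come from `std::env::var(name).ok()`.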
starpu_get_env_string_var_default
If the environment variable \c str is defined and its value is contained in the array \c strings, return the array position. Raise an error if the environment variable \c str is defined with a value not in \c strings. Return \c defvalue if the environment variable \c str is not defined. See \ref ExecutionConfigurationThroughEnvironmentVariables for more details.
starpu_get_hwloc_topology
Get the hwloc topology used by StarPU. One can use this pointer to get information about topology, but not to change settings related to topology. See \ref HardwareTopology for more details.
starpu_get_memory_location_bitmap
Return a bitmap representing logical indexes of NUMA nodes where the buffer targeted by \p ptr is allocated. An error is notified by a negative result. See \ref HardwareTopology for more details.
starpu_get_next_bindid
Return a PU binding ID which can be used to bind threads with starpu_bind_thread_on(). \p flags can be set to ::STARPU_THREAD_ACTIVE or 0. When \p npreferred is set to non-zero, \p preferred is an array of size \p npreferred in which a preference of PU binding IDs can be set. By default StarPU will return the first PU available for binding. See \ref KernelThreadsStartedByStarPU and \ref cpuWorkers for more details.
starpu_get_prefetch_flag
Whether \ref STARPU_PREFETCH was set. See \ref SchedulingHelpers for more details.
starpu_get_pu_os_index
If \c hwloc is used, convert the given \p logical_index of a PU to the OS index of this PU. If \c hwloc is not used, return \p logical_index. See \ref HardwareTopology for more details.
starpu_get_sched_lib_policies
Allow an external library to return a list of scheduling policies to be loaded dynamically. See \ref UsingaNewSchedulingPolicy for more details.
starpu_get_sched_lib_policy
Allow an external library to return a scheduling policy to be loaded dynamically. See \ref UsingaNewSchedulingPolicy for more details.
starpu_get_version
Return as 3 integers the version of StarPU used when running the application. See \ref ConfigurationAndInitialization for more details.
starpu_getenv
Retrieve the value of an environment variable. See \ref ExecutionConfigurationThroughEnvironmentVariables for more details.
starpu_hash_crc32c_be
Compute the CRC of a 32bit number seeded by the \p inputcrc current state. The return value should be considered as the new current state for future CRC computation. This is used for computing data size footprint. See \ref DefiningANewDataInterface_footprint for more details.
starpu_hash_crc32c_be_n
Compute the CRC of a byte buffer seeded by the \p inputcrc current state. The return value should be considered as the new current state for future CRC computation. This is used for computing data size footprint. See \ref DefiningANewDataInterface_footprint for more details.
starpu_hash_crc32c_be_ptr
Compute the CRC of a pointer value seeded by the \p inputcrc current state. The return value should be considered as the new current state for future CRC computation. This is used for computing data size footprint. See \ref DefiningANewDataInterface_footprint for more details.
starpu_hash_crc32c_string
Compute the CRC of a string seeded by the \p inputcrc current state. The return value should be considered as the new current state for future CRC computation. This is used for computing data size footprint. See \ref DefiningANewDataInterface_footprint for more details.
starpu_hip_worker_get_count
Return the number of HIP devices controlled by StarPU. The return value should be at most \ref STARPU_MAXHIPDEVS. See \ref TopologyWorkers for more details.
starpu_hipblas_get_local_handle
Return the HIPBLAS handle to be used to queue HIPBLAS kernels. It is properly initialized and configured for multistream by starpu_hipblas_init().
starpu_hipblas_init
Initialize HIPBLAS on every HIP device. The HIPBLAS library must be initialized prior to any HIPBLAS call. Calling starpu_hipblas_init() will initialize HIPBLAS on every HIP device controlled by StarPU. This call blocks until HIPBLAS has been properly initialized on every device.
starpu_hipblas_shutdown
Synchronously deinitialize the HIPBLAS library on every HIP device.
starpu_idle_hook_deregister
starpu_idle_hook_register
starpu_idle_prefetch_task_input_for
Prefetch data for a given \p task on a given \p worker when the bus is idle. See \ref SchedulingHelpers for more details.
starpu_idle_prefetch_task_input_for_prio
Prefetch data for a given \p task on a given \p worker when the bus is idle with a given priority. See \ref SchedulingHelpers for more details.
starpu_idle_prefetch_task_input_on_node
Prefetch data for a given \p task on a given \p node when the bus is idle. See \ref SchedulingHelpers for more details.
starpu_idle_prefetch_task_input_on_node_prio
Prefetch data for a given \p task on a given \p node when the bus is idle with a given priority. See \ref SchedulingHelpers for more details.
starpu_init
StarPU initialization method, must be called prior to any other StarPU call. It is possible to specify StarPU’s configuration (e.g. scheduling policy, number of cores, …) by passing a non-NULL \p conf. Default configuration is used if \p conf is NULL. Upon successful completion, this function returns 0. Otherwise, -ENODEV indicates that no worker was available (and thus StarPU was not initialized). See \ref SubmittingATask for more details.
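Many functions on this page, starpu_init() included, report failure as a negative errno value. When calling the raw bindings, a small adapter can turn that convention into a Rust `Result`. The helper below is a hypothetical sketch using only the standard library; it is not provided by this crate.

```rust
use std::io;

/// Convert a C-style return code (0 on success, negative errno on failure)
/// into an `io::Result`.
pub fn check(ret: i32) -> io::Result<()> {
    match ret {
        0 => Ok(()),
        // Negative values carry an errno, e.g. -ENODEV for starpu_init().
        r if r < 0 => Err(io::Error::from_raw_os_error(r.wrapping_neg())),
        // Positive values are not described by the convention; flag them.
        r => Err(io::Error::new(
            io::ErrorKind::Other,
            format!("unexpected return code {}", r),
        )),
    }
}
```

With the real bindings, the call site would presumably wrap the unsafe call, along the lines of `check(unsafe { starpu_sys::starpu_init(std::ptr::null_mut()) })`, assuming the binding takes a nullable pointer to starpu_conf as the description suggests.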
starpu_initialize
Similar to starpu_init(), but also take the \p argc and \p argv as defined by the application, which is necessary when running in Simgrid mode or MPI Master Slave mode. Do not call starpu_init() and starpu_initialize() in the same program. See \ref SubmittingATask for more details.
starpu_insert_task
Identical to starpu_task_insert(). Kept to avoid breaking old codes.
starpu_interface_copy
Copy \p size bytes from byte offset \p src_offset of \p src on \p src_node to byte offset \p dst_offset of \p dst on \p dst_node. This is to be used in the starpu_data_copy_methods::any_to_any copy method, which is provided with \p async_data to be passed to starpu_interface_copy(). This returns -EAGAIN if the transfer is still ongoing, or 0 if the transfer is already completed.
starpu_interface_copy2d
Copy \p numblocks blocks of \p blocksize bytes from byte offset \p src_offset of \p src on \p src_node to byte offset \p dst_offset of \p dst on \p dst_node.
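A blocked copy of this kind can be pictured on ordinary byte slices. The sketch below is a host-only model, not the binding itself: the `ld_src` and `ld_dst` stride parameters (in bytes) are an assumption, since the description above does not say how consecutive blocks are spaced, and memory nodes and asynchronous completion are not modelled.

```rust
/// Copy `numblocks` blocks of `blocksize` bytes between two byte slices.
/// `ld_src` / `ld_dst` are the assumed byte distances between the starts
/// of consecutive blocks in the source and destination.
pub fn copy2d(
    src: &[u8],
    src_offset: usize,
    ld_src: usize,
    dst: &mut [u8],
    dst_offset: usize,
    ld_dst: usize,
    blocksize: usize,
    numblocks: usize,
) {
    for block in 0..numblocks {
        let s = src_offset + block * ld_src;
        let d = dst_offset + block * ld_dst;
        // Slicing panics on out-of-range accesses instead of corrupting memory.
        dst[d..d + blocksize].copy_from_slice(&src[s..s + blocksize]);
    }
}
```

Copying the first two bytes of each 4-byte row of a 3-row buffer into a packed destination would use `ld_src = 4`, `ld_dst = 2`, `blocksize = 2` and `numblocks = 3`.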
starpu_interface_copy3d
Copy \p numblocks_1 * \p numblocks_2 blocks of \p blocksize bytes from byte offset \p src_offset of \p src on \p src_node to byte offset \p dst_offset of \p dst on \p dst_node.
starpu_interface_copy4d
Copy \p numblocks_1 * \p numblocks_2 * \p numblocks_3 blocks of \p blocksize bytes from byte offset \p src_offset of \p src on \p src_node to byte offset \p dst_offset of \p dst on \p dst_node.
starpu_interface_copynd
Copy \p nn[1] * \p nn[2]…* \p nn[ndim-1] blocks of \p nn[0] * \p elemsize bytes from byte offset \p src_offset of \p src on \p src_node to byte offset \p dst_offset of \p dst on \p dst_node.
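The block count and block size in the description of starpu_interface_copynd() follow directly from the \p nn array. A minimal sketch of that arithmetic, with a hypothetical function name:

```rust
/// Return (number of blocks, bytes per block) for an n-dimensional copy:
/// nn[1] * ... * nn[ndim-1] blocks of nn[0] * elemsize bytes.
pub fn nd_copy_shape(nn: &[usize], elemsize: usize) -> (usize, usize) {
    assert!(!nn.is_empty(), "at least one dimension is required");
    let block_bytes = nn[0] * elemsize;
    // An empty product is 1, so a 1-dimensional copy is a single block.
    let blocks: usize = nn[1..].iter().product();
    (blocks, block_bytes)
}
```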
starpu_interface_data_copy
Record in offline execution traces the copy of \p size bytes from node \p src_node to node \p dst_node. See \ref DefiningANewDataInterface_copy for more details.
starpu_interface_end_driver_copy_async
See starpu_interface_start_driver_copy_async(). See \ref DefiningANewDataInterface_copy for more details.
starpu_interface_map
Used to set starpu_data_interface_ops::map_data. See \ref DefiningANewDataInterface_pointers for more details.
starpu_interface_start_driver_copy_async
When an asynchronous implementation of the data transfer is implemented, the call to the underlying CUDA, OpenCL, etc. call should be surrounded by calls to starpu_interface_start_driver_copy_async() and starpu_interface_end_driver_copy_async(), so that it is recorded in offline execution traces, and the timing of the submission is checked. \p start must point to a variable whose value will be passed unchanged to starpu_interface_end_driver_copy_async().
starpu_interface_unmap
Used to set starpu_data_interface_ops::unmap_data. See \ref DefiningANewDataInterface_pointers for more details.
starpu_interface_update_map
Used to set starpu_data_interface_ops::update_map. See \ref DefiningANewDataInterface_pointers for more details.
starpu_is_initialized
Return 1 if StarPU is already initialized. See \ref ConfigurationAndInitialization for more details.
starpu_is_paused
Return !0 if task processing by workers is currently paused, 0 otherwise. See \ref StarPUEatsCPUs for more details.
starpu_iteration_pop
Drop the iteration number for submitted tasks. This must match a previous call to starpu_iteration_push(), and is typically called at the end of a task submission loop. See \ref CreatingAGanttDiagram for more details.
starpu_iteration_push
Set the iteration number for all the tasks to be submitted after this call. This is typically called at the beginning of a task submission loop. This number will then show up in tracing tools. A corresponding starpu_iteration_pop() call must be made to match the call to starpu_iteration_push(), at the end of the same task submission loop, typically.
starpu_malloc
Allocate data of the given size \p dim in main memory, and return the pointer to the allocated data through \p A. It will also try to pin it in CUDA or OpenCL, so that data transfers from this buffer can be asynchronous, and thus permit data transfer and computation overlapping. The allocated buffer must be freed thanks to the starpu_free_noflag() function. See \ref DataManagementAllocation for more details.
starpu_malloc_flags
Perform a memory allocation based on the constraints defined by the given flag. See \ref HowToLimitMemoryPerNode for more details.
starpu_malloc_on_node
Allocate \p size bytes on node \p dst_node with the default allocation flags. This returns 0 if the allocation failed; the allocation method should then return -ENOMEM as the allocated size. Deallocation must be done with starpu_free_on_node().
starpu_malloc_on_node_flags
Allocate \p size bytes on node \p dst_node with the given allocation \p flags. This returns 0 if the allocation failed; the allocation method should then return -ENOMEM as the allocated size. Deallocation must be done with starpu_free_on_node_flags().
starpu_malloc_on_node_set_default_flags
Define the default flags for allocations performed by starpu_malloc_on_node() and starpu_free_on_node(). The default is \ref STARPU_MALLOC_PINNED | \ref STARPU_MALLOC_COUNT. See \ref HowToLimitMemoryPerNode for more details.
starpu_malloc_set_align
Set an alignment constraint for starpu_malloc() allocations. \p align must be a power of two. This is for instance called automatically by the OpenCL driver to specify its own alignment constraints. See \ref DataManagementAllocation for more details.
starpu_malloc_set_hooks
Set allocation functions to be used by StarPU. By default, StarPU will use \c malloc() (or \c cudaHostAlloc() if CUDA GPUs are used) for all its data handle allocations. The application can specify another allocation primitive by calling this. The malloc_hook should pass the allocated pointer through the \c A parameter, and return 0 on success. On allocation failure, it should return -ENOMEM. The \c flags parameter contains ::STARPU_MALLOC_PINNED if the memory should be pinned by the hook for GPU transfer efficiency. The hook can use starpu_memory_pin() to achieve this. The \c dst_node parameter is the starpu memory node, one can convert it to an hwloc logical id with starpu_memory_nodes_numa_id_to_hwloclogid() or to an OS NUMA number with starpu_memory_nodes_numa_devid_to_id(). See \ref DataManagementAllocation for more details.
starpu_map_enabled
Return 1 if memory mapping support between memory nodes is enabled. See \ref Basic for more details.
starpu_matrix_data_register
Register the \p nx x \p ny 2D matrix of \p elemsize-byte elements pointed by \p ptr and initialize \p handle to represent it. \p ld specifies the number of elements between rows. A value greater than \p nx adds padding, which can be useful for alignment purposes.
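The role of \p ld is easiest to see in the address computation for one element. The sketch below assumes that x is the contiguous dimension, so element (i, j) with 0 <= i < nx and 0 <= j < ny sits `j * ld + i` elements after \p ptr. The function names are hypothetical.

```rust
/// Byte offset of element (i, j) in a matrix with leading dimension `ld`.
/// Assumption: x (index i) is the contiguous dimension.
pub fn matrix_byte_offset(i: usize, j: usize, ld: usize, elemsize: usize) -> usize {
    (j * ld + i) * elemsize
}

/// Number of padding elements at the end of each row (ld >= nx).
pub fn row_padding(nx: usize, ld: usize) -> usize {
    assert!(ld >= nx, "ld must be at least nx");
    ld - nx
}
```

With `nx = 6`, `ld = 8` and 4-byte elements, each row carries 2 padding elements and consecutive rows start 32 bytes apart.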
starpu_matrix_data_register_allocsize
Similar to starpu_matrix_data_register(), but additionally specifies which allocation size should be used instead of the initial nx*ny*elemsize.
starpu_matrix_filter_block
Partition a dense Matrix along the x dimension, thus getting (x/\p nparts ,y) matrices. If \p nparts does not divide x, the last submatrix contains the remainder.
starpu_matrix_filter_block_shadow
Partition a dense Matrix along the x dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting ((x-2*shadow)/\p nparts +2*shadow,y) matrices. If \p nparts does not divide x-2*shadow, the last submatrix contains the remainder.
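The sub-matrix widths implied by the shadow formula can be checked with a few lines of arithmetic. This is a sketch of the sizes only, following the formula and remainder rule stated above; it does not model how the shadow regions overlap in memory, and the function name is hypothetical.

```rust
/// Width along x of each sub-matrix produced by a block filter with a
/// shadow border: (x - 2*shadow) / nparts + 2*shadow, the last part
/// also taking the remainder. Use shadow = 0 for the plain block filter.
pub fn shadow_block_widths(x: usize, nparts: usize, shadow: usize) -> Vec<usize> {
    assert!(nparts > 0, "nparts must be positive");
    assert!(x >= 2 * shadow, "matrix too narrow for the shadow border");
    let interior = x - 2 * shadow;
    let base = interior / nparts;
    let remainder = interior % nparts;
    (0..nparts)
        .map(|id| {
            let own = if id + 1 == nparts { base + remainder } else { base };
            own + 2 * shadow
        })
        .collect()
}
```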
starpu_matrix_filter_pick_variable
Pick \p nparts contiguous variables from a matrix. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_matrix_filter_pick_variable_child_ops
Return the child_ops of the partition obtained with starpu_matrix_filter_pick_variable(). See \ref MatrixDataInterface for more details.
starpu_matrix_filter_pick_vector_child_ops
Return the child_ops of the partition obtained with starpu_matrix_filter_pick_vector_y(). See \ref MatrixDataInterface for more details.
starpu_matrix_filter_pick_vector_y
Pick \p nparts contiguous vectors from a matrix along the Y dimension. The starting position on Y-axis is set in starpu_data_filter::filter_arg_ptr.
starpu_matrix_filter_vertical_block
Partition a dense Matrix along the y dimension, thus getting (x,y/\p nparts) matrices. If \p nparts does not divide y, the last submatrix contains the remainder.
starpu_matrix_filter_vertical_block_shadow
Partition a dense Matrix along the y dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting (x,(y-2*shadow)/\p nparts +2*shadow) matrices. If \p nparts does not divide y-2*shadow, the last submatrix contains the remainder.
starpu_matrix_get_allocsize
Return the allocated size of the matrix designated by \p handle.
starpu_matrix_get_elemsize
Return the size of the elements registered into the matrix designated by \p handle.
starpu_matrix_get_local_ld
Return the number of elements between each row of the matrix designated by \p handle. May be equal to nx when there is no padding.
starpu_matrix_get_local_ptr
Return the local pointer associated with \p handle.
starpu_matrix_get_nx
Return the number of elements on the x-axis of the matrix designated by \p handle.
starpu_matrix_get_ny
Return the number of elements on the y-axis of the matrix designated by \p handle.
starpu_matrix_ptr_register
Register into the \p handle that to store data on node \p node it should use the buffer located at \p ptr, or device handle \p dev_handle and offset \p offset (for OpenCL, notably), with \p ld elements between rows.
starpu_memchunk_tidy
See \ref DataPrefetch for more details.
starpu_memory_allocate
If a memory limit is defined on the given node (see Section \ref HowToLimitMemoryPerNode), try to allocate some of it. This does not actually allocate memory, but only accounts for it. This can be useful when the application allocates data another way, but wants StarPU to be aware of the allocation size e.g. for memory reclaiming. By default, return -ENOMEM if there is not enough room on the given node. \p flags can be either ::STARPU_MEMORY_WAIT or ::STARPU_MEMORY_OVERFLOW to change this. See \ref HowToLimitMemoryPerNode for more details.
starpu_memory_deallocate
If a memory limit is defined on the given node (see Section \ref HowToLimitMemoryPerNode), free some of it. This does not actually free memory, but only accounts for it, like starpu_memory_allocate(). The amount does not have to be exactly the same as what was passed to starpu_memory_allocate(), only the eventual amount needs to be the same, i.e. one call to starpu_memory_allocate() can be followed by several calls to starpu_memory_deallocate() to declare the deallocation piece by piece. See \ref HowToLimitMemoryPerNode for more details.
starpu_memory_get_available
If a memory limit is defined on the given node (see Section \ref HowToLimitMemoryPerNode), return the amount of available memory on the node. Otherwise return -1. See \ref HowToLimitMemoryPerNode for more details.
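The accounting semantics shared by starpu_memory_allocate(), starpu_memory_deallocate() and starpu_memory_get_available() can be modelled with a small counter. The `NodeBudget` type below is a hypothetical stand-in for one memory node; it covers only the default behaviour and ignores the ::STARPU_MEMORY_WAIT and ::STARPU_MEMORY_OVERFLOW flags.

```rust
/// Accounting-only model of one memory node: no memory is allocated,
/// only the bookkeeping is tracked.
pub struct NodeBudget {
    limit: Option<u64>,
    used: u64,
}

impl NodeBudget {
    pub fn new(limit: Option<u64>) -> Self {
        Self { limit, used: 0 }
    }

    /// Account for `size` bytes; `Err` mirrors the -ENOMEM case.
    pub fn allocate(&mut self, size: u64) -> Result<(), &'static str> {
        let wanted = self.used.saturating_add(size);
        if let Some(limit) = self.limit {
            if wanted > limit {
                return Err("ENOMEM");
            }
        }
        self.used = wanted;
        Ok(())
    }

    /// Give back `size` bytes; may be called piece by piece.
    pub fn deallocate(&mut self, size: u64) {
        self.used = self.used.saturating_sub(size);
    }

    /// Remaining bytes, or -1 when no limit is defined.
    pub fn available(&self) -> i64 {
        match self.limit {
            Some(limit) => limit.saturating_sub(self.used) as i64,
            None => -1,
        }
    }
}
```

One allocation of 60 bytes can be returned as two deallocations of 20 and 40 bytes; only the eventual total has to match.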
starpu_memory_get_available_all_nodes
Return the amount of available memory on all memory nodes for which a memory limit is defined (see Section \ref DataManagementAllocation).
starpu_memory_get_total
If a memory limit is defined on the given node (see Section \ref HowToLimitMemoryPerNode), return the amount of total memory on the node. Otherwise return -1. See \ref HowToLimitMemoryPerNode for more details.
starpu_memory_get_total_all_nodes
Return the amount of total memory on all memory nodes for which a memory limit is defined (see Section \ref DataManagementAllocation).
starpu_memory_get_used
Return the amount of used memory on the node. See \ref DataManagementAllocation for more details.
starpu_memory_get_used_all_nodes
Return the amount of used memory on all memory nodes. See \ref DataManagementAllocation for more details.
starpu_memory_node_get_devid
See \ref TopologyMemory for more details.
starpu_memory_node_get_ids_by_type
Get the list of memory nodes of kind \p kind. Fill the array \p memory_nodes_ids with the memory node numbers. The argument \p maxsize indicates the size of the array \p memory_nodes_ids. The return value gives the number of node numbers that were put in the array. -ERANGE is returned if \p maxsize is lower than the number of memory nodes of the appropriate kind: in that case, the array is filled with the first \p maxsize elements. To avoid such overflows, the value of \p maxsize can be chosen by means of the function starpu_memory_nodes_get_count_by_kind(), or by passing a value greater than or equal to \ref STARPU_MAXNODES. See \ref TopologyWorkers for more details.
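The truncating fill and the -ERANGE return value can be modelled in plain Rust as follows. This sketch does not call StarPU: `ids_by_kind` and its parameters are hypothetical, node kinds are reduced to small integers, and `ERANGE` is hard-coded to its usual Linux value as an assumption.

```rust
/// Assumed value of ERANGE on Linux; other platforms may differ.
const ERANGE: i32 = 34;

/// Hypothetical model: write the ids of the nodes whose kind equals `kind`
/// into `out`. Returns the number of ids written, or -ERANGE when `out` is
/// too small (in which case `out` holds the first `out.len()` matches).
fn ids_by_kind(node_kinds: &[u8], kind: u8, out: &mut [u32]) -> i32 {
    let mut written = 0usize;
    let mut overflow = false;
    for (id, &k) in node_kinds.iter().enumerate() {
        if k != kind {
            continue;
        }
        if written < out.len() {
            out[written] = id as u32;
            written += 1;
        } else {
            // A matching node did not fit in the caller's array.
            overflow = true;
        }
    }
    if overflow {
        -ERANGE
    } else {
        written as i32
    }
}
```

Sizing `out` from a prior count of matching nodes, as the description suggests, avoids the overflow branch entirely.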
starpu_memory_node_get_name
Return in \p name the name of a memory node (NUMA 0, CUDA 0, etc.). \p size is the size of the \p name array. See \ref TopologyWorkers for more details.
starpu_memory_node_get_worker_archtype
Return the type of worker which operates on memory node kind \p node_kind. See \ref TopologyWorkers for more details.
starpu_memory_nodes_get_count
Return the number of memory nodes. See \ref TopologyWorkers for more details.
starpu_memory_nodes_get_count_by_kind
Return the number of memory nodes of a given \p kind. See \ref TopologyWorkers for more details.
starpu_memory_nodes_get_numa_count
Return the number of NUMA nodes used by StarPU. See \ref TopologyWorkers for more details.
starpu_memory_nodes_numa_devid_to_id
Return the Operating System identifier of the memory node whose StarPU identifier is \p id. See \ref TopologyWorkers for more details.
starpu_memory_nodes_numa_id_to_devid
Return the identifier of the memory node associated to the NUMA node identified by \p osid by the Operating System. See \ref TopologyWorkers for more details.
starpu_memory_pin
Pin the given memory area, so that CPU-GPU transfers can be done asynchronously with DMAs. The memory must be unpinned with starpu_memory_unpin() before being freed. Return 0 on success, -1 on error. See \ref DataManagementAllocation for more details.
starpu_memory_unpin
Unpin the given memory area previously pinned with starpu_memory_pin(). Return 0 on success, -1 on error. See \ref DataManagementAllocation for more details.
starpu_memory_wait_available
If a memory limit is defined on the given node (see Section \ref HowToLimitMemoryPerNode), this will wait for \p size bytes to become available on \p node. Of course, since another thread may be allocating memory concurrently, this does not necessarily mean that this amount will be actually available, just that it was reached. To atomically wait for some amount of memory and reserve it, starpu_memory_allocate() should be used with the ::STARPU_MEMORY_WAIT flag. See \ref HowToLimitMemoryPerNode for more details.
starpu_mpi_ms_worker_get_count
Return the number of MPI Master Slave workers controlled by StarPU. See \ref TopologyWorkers for more details.
starpu_multiformat_data_register
Register a piece of data that can be represented in different ways, depending upon the processing unit that manipulates it. It allows the programmer, for instance, to use an array of structures when working on a CPU, and a structure of arrays when working on a GPU. \p nobjects is the number of elements in the data. \p format_ops describes the format. See \ref TheMultiformatInterface for more details.
starpu_ndim_data_register
Register the \p nn[0] x \p nn[1] x … \p ndim-dimension matrix of \p elemsize byte elements pointed to by \p ptr and initialize \p handle to represent it. Again, \p ldn specifies the number of elements between two units on each dimension.
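To make the role of \p ldn concrete, the following pure-Rust sketch computes leading dimensions for a contiguous array and the element offset of a multi-index. It assumes that dimension 0 varies fastest, so that a contiguous array has `ldn = [1, nn[0], nn[0] * nn[1], ...]`; that layout is an assumption, and `contiguous_ldn` and `element_offset` are hypothetical helpers, not part of these bindings.

```rust
/// Leading dimensions of a contiguous array with extents `nn`
/// (assumed layout: dimension 0 varies fastest).
fn contiguous_ldn(nn: &[usize]) -> Vec<usize> {
    let mut ldn = Vec::with_capacity(nn.len());
    let mut stride = 1;
    for &n in nn {
        ldn.push(stride);
        stride *= n;
    }
    ldn
}

/// Offset, in elements, of the entry at `index` given leading dimensions `ldn`.
fn element_offset(index: &[usize], ldn: &[usize]) -> usize {
    index.iter().zip(ldn).map(|(i, ld)| i * ld).sum()
}
```

For a 4 x 3 x 2 array this gives `ldn = [1, 4, 12]`, and the element at index (1, 2, 1) sits at offset 1 + 8 + 12 = 21.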
starpu_ndim_filter_1d_pick_variable
Pick \p nparts contiguous variables from a 1-dim array. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_ndim_filter_2d_pick_vector
Pick \p nparts contiguous vectors from a 2-dim array along the given dimension set in starpu_data_filter::filter_arg. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_ndim_filter_3d_pick_matrix
Pick \p nparts contiguous matrices from a 3-dim array along the given dimension set in starpu_data_filter::filter_arg. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_ndim_filter_4d_pick_block
Pick \p nparts contiguous blocks from a 4-dim array along the given dimension set in starpu_data_filter::filter_arg. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_ndim_filter_5d_pick_tensor
Pick \p nparts contiguous tensors from a 5-dim array along the given dimension set in starpu_data_filter::filter_arg. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_ndim_filter_block
Partition an ndim array along the given dimension set in starpu_data_filter::filter_arg. If \p nparts does not divide the number of elements on that dimension, the last submatrix contains the remainder.
starpu_ndim_filter_block_shadow
Partition an ndim array along the given dimension set in starpu_data_filter::filter_arg, with a shadow border starpu_data_filter::filter_arg_ptr. If \p nparts does not divide the number of elements on that dimension, the last submatrix contains the remainder.
starpu_ndim_filter_pick_block_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_pick_block(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_pick_matrix_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_pick_matrix(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_pick_ndim
Pick \p nparts contiguous (n-1)dim arrays from an ndim array along the given dimension set in starpu_data_filter::filter_arg. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_ndim_filter_pick_tensor_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_pick_tensor(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_pick_variable
Pick \p nparts contiguous variables from an ndim array. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_ndim_filter_pick_variable_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_pick_variable(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_pick_vector_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_pick_vector(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_to_block
Partition a 3-dim array into \p nparts blocks along the given dimension set in starpu_data_filter::filter_arg.
starpu_ndim_filter_to_block_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_to_block(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_to_matrix
Partition a 2-dim array into \p nparts matrices along the given dimension set in starpu_data_filter::filter_arg.
starpu_ndim_filter_to_matrix_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_to_matrix(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_to_tensor
Partition a 4-dim array into \p nparts tensors along the given dimension set in starpu_data_filter::filter_arg.
starpu_ndim_filter_to_tensor_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_to_tensor(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_to_variable
Transfer a 0-dim array to a variable.
starpu_ndim_filter_to_variable_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_to_variable(). See \ref NdimDataInterface for more details.
starpu_ndim_filter_to_vector
Partition a 1-dim array into \p nparts vectors.
starpu_ndim_filter_to_vector_child_ops
Return the child_ops of the partition obtained with starpu_ndim_filter_to_vector(). See \ref NdimDataInterface for more details.
starpu_ndim_get_elemsize
Return the size of the elements of the ndim array designated by \p handle.
starpu_ndim_get_local_ldi
Return the number of elements between two units on the i-axis dimension of the ndim array designated by \p handle, in the format of the current memory node.
starpu_ndim_get_local_ldn
Return the number of elements between two units on each dimension of the ndim array designated by \p handle, in the format of the current memory node.
starpu_ndim_get_local_ptr
Return the local pointer associated with \p handle.
starpu_ndim_get_ndim
Return the number of dimensions of the ndim array designated by \p handle.
starpu_ndim_get_ni
Return the number of elements on the i-axis of the ndim array designated by \p handle. When i=0, it means x-axis, when i=1, it means y-axis, when i=2, it means z-axis, etc.
starpu_ndim_get_nn
Return the number of elements on each dimension of the ndim array designated by \p handle.
starpu_ndim_ptr_register
Register into \p handle that, to store data on node \p node, it should use the buffer located at \p ptr, or the device handle \p dev_handle and offset \p offset (for OpenCL, notably), with \p ldn elements between two units on each dimension.
starpu_node_get_kind
Return the type of \p node as defined by ::starpu_node_kind. For example, when defining a new data interface, this function should be used in the allocation function to determine on which device the memory needs to be allocated. See \ref TopologyWorkers for more details.
starpu_omp_atomic_fallback_inline_begin
Implement the entry point of a fallback global atomic region. Block until it succeeds in acquiring exclusive access to the global atomic region.
starpu_omp_atomic_fallback_inline_end
Implement the exit point of a fallback global atomic region. Release the exclusive access to the global atomic region.
starpu_omp_barrier
Wait until each participating thread of the innermost OpenMP parallel region has reached the barrier and each explicit OpenMP task bound to this region has completed its execution.
starpu_omp_critical
Wait until no other thread is executing within the context of the selected critical section, then proceed to the exclusive execution of a function within the critical section. \p f is the function to be executed in the critical section. \p arg is an argument passed to function \p f. \p name is the name of the selected critical section. If name == NULL, the selected critical section is the unique anonymous critical section.
starpu_omp_critical_inline_begin
Wait until execution can proceed exclusively within the context of the selected critical section. \p name is the name of the selected critical section. If name == NULL, the selected critical section is the unique anonymous critical section.
starpu_omp_critical_inline_end
End the exclusive execution within the context of the selected critical section. \p name is the name of the selected critical section. If name==NULL, the selected critical section is the unique anonymous critical section.
starpu_omp_data_lookup
Return the handle corresponding to the data pointed to by the \p ptr host pointer.
starpu_omp_destroy_lock
Destroy an opaque lock object.
starpu_omp_destroy_nest_lock
Destroy an opaque lock object supporting nested locking operations.
starpu_omp_for
Execute a parallel loop together with the other threads participating in the innermost parallel region. \p f is the function to be executed iteratively. \p arg is an argument passed to function \p f. \p nb_iterations is the number of iterations to be performed by the parallel loop. \p chunk is the number of consecutive iterations that should be assigned to the same thread when scheduling the loop workshares; it follows the semantics of the \c modifier argument in the OpenMP #pragma omp for specification. \p schedule is the scheduling mode according to the OpenMP specification. \p ordered is a flag indicating whether the loop region may contain an ordered section (ordered==!0) or not (ordered==0). \p nowait is a flag indicating whether an implicit barrier is requested after the for section (nowait==0) or not (nowait==!0).
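As an illustration of what \p chunk means, the sketch below shows how the OpenMP specification assigns iterations under a static schedule: the iteration space is cut into chunks of `chunk` iterations, which are handed to threads round-robin by thread number. It models the OpenMP rule only and says nothing about how StarPU implements it; `static_owner` is a hypothetical helper.

```rust
/// Thread number that owns iteration `i` under an OpenMP-style static
/// schedule with the given chunk size and team size.
fn static_owner(i: u64, chunk: u64, nthreads: u64) -> u64 {
    assert!(chunk > 0 && nthreads > 0, "chunk and team size must be non-zero");
    // Chunk index, then round-robin over the team.
    (i / chunk) % nthreads
}
```

With 10 iterations, a chunk of 2 and 3 threads, iterations 0-1 and 6-7 go to thread 0, 2-3 and 8-9 to thread 1, and 4-5 to thread 2.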
starpu_omp_for_alt
Alternative implementation of a parallel loop. Differs from #starpu_omp_for in the expected arguments of the loop function \c f.
starpu_omp_for_inline_first
Decide whether the current thread should start to execute a parallel loop section. See #starpu_omp_for for the argument description.
starpu_omp_for_inline_first_alt
Inline version of the alternative implementation of a parallel loop.
starpu_omp_for_inline_next
Decide whether the current thread should continue to execute a parallel loop section. See #starpu_omp_for for the argument description.
starpu_omp_for_inline_next_alt
Inline version of the alternative implementation of a parallel loop.
starpu_omp_get_active_level
Return the nesting level of the current innermost active parallel section.
starpu_omp_get_ancestor_thread_num
Return the number of the ancestor of the current parallel section.
starpu_omp_get_cancellation
Return the state of the cancel ICV (internal control variable).
starpu_omp_get_default_arbiter
Only used internally by StarPU.
starpu_omp_get_default_device
Return the number of the device used as default.
starpu_omp_get_dynamic
Return the state of dynamic thread number adjustment.
starpu_omp_get_initial_device
Return a device number that represents the host device.
starpu_omp_get_level
Return the nesting level of the current parallel section.
starpu_omp_get_max_active_levels
Return the current maximum number of allowed active parallel section levels.
starpu_omp_get_max_task_priority
Return the maximum value that can be specified in the priority clause.
starpu_omp_get_max_threads
Return the maximum number of threads that can be used to create a region from the current region.
starpu_omp_get_nested
Return whether nested parallel sections are enabled or not.
starpu_omp_get_num_devices
Return the number of devices.
starpu_omp_get_num_places
Return the number of places available to the execution environment in the place list.
starpu_omp_get_num_procs
Return the number of StarPU CPU workers.
starpu_omp_get_num_teams
Return the number of teams in the current teams region.
starpu_omp_get_num_threads
Return the number of threads of the current region.
starpu_omp_get_partition_num_places
Return the number of places in the place partition of the innermost implicit task.
starpu_omp_get_partition_place_nums
Return the list of place numbers corresponding to the places in the place-partition-var ICV of the innermost implicit task.
starpu_omp_get_place_num
Return the place number of the place to which the encountering thread is bound.
starpu_omp_get_place_num_procs
Return the number of processors available to the execution environment in the specified place.
starpu_omp_get_place_proc_ids
Return the numerical identifiers of the processors available to the execution environment in the specified place.
starpu_omp_get_proc_bind
Return the proc_bind setting of the current parallel region.
starpu_omp_get_schedule
Return the kind and the modifier of the current default loop scheduler.
starpu_omp_get_team_num
Return the team number of the calling thread.
starpu_omp_get_team_size
Return the size of the team of the current parallel section.
starpu_omp_get_thread_limit
Return the number of StarPU CPU workers.
starpu_omp_get_thread_num
Return the rank of the current thread among the threads of the current region.
starpu_omp_get_wtick
Return the precision of the time used by \p starpu_omp_get_wtime().
starpu_omp_get_wtime
Return the elapsed wallclock time in seconds.
starpu_omp_handle_register
Register a handle for ptr->handle data lookup.
starpu_omp_handle_unregister
Unregister a handle from ptr->handle data lookup.
starpu_omp_in_final
Check whether the current task is final or not.
starpu_omp_in_parallel
Return whether it is called from the scope of a parallel region or not.
starpu_omp_init
Initialize StarPU and its OpenMP Runtime support. See \ref OMPInitExit for more details.
starpu_omp_init_lock
Initialize an opaque lock object.
starpu_omp_init_nest_lock
Initialize an opaque lock object supporting nested locking operations.
starpu_omp_is_initial_device
Check whether the current device is the initial device or not.
starpu_omp_master
Execute a function only on the master thread of the OpenMP parallel region it is called from. When called from a thread that is not the master of the parallel region it is called from, this function does nothing. \p f is the function to be called. \p arg is an argument passed to function \p f.
starpu_omp_master_inline
Determine whether the calling thread is the master of the OpenMP parallel region it is called from or not.
starpu_omp_ordered
Ensure that a function is sequentially executed once for each iteration in order within a parallel loop, by the thread that owns the iteration. \p f is the function to be executed by the thread that owns the current iteration. \p arg is an argument passed to function \p f.
starpu_omp_ordered_inline_begin
Wait until all the iterations of a parallel loop below the iteration owned by the current thread have been executed.
starpu_omp_ordered_inline_end
Notify that the ordered section for the current iteration has been completed.
starpu_omp_parallel_region
Generate and launch an OpenMP parallel region and return after its completion. \p attr specifies the attributes for the generated parallel region. If this function is called from inside another, generating, parallel region, the generated parallel region is nested within the generating parallel region.
starpu_omp_sections
Ensure that each function of a given array of functions is executed by one and only one thread. \p nb_sections is the number of functions in the array \p section_f. \p section_f is the array of functions to be executed as sections. \p section_arg is an array of arguments to be passed to the corresponding function. \p nowait is a flag indicating whether an implicit barrier is requested after the execution of all the sections (nowait==0) or not (nowait==!0).
starpu_omp_sections_combined
Alternative implementation of sections. Differs from #starpu_omp_sections in that all the sections are combined within a single function in this version. \p section_f is the function implementing the combined sections.
starpu_omp_set_default_device
Set the number of the device to use as default.
starpu_omp_set_dynamic
Enable (1) or disable (0) dynamically adjusting the number of parallel threads.
starpu_omp_set_lock
Lock an opaque lock object. If the lock is already locked, the function will block until it succeeds in exclusively acquiring the lock.
starpu_omp_set_max_active_levels
Set the maximum number of allowed active parallel section levels.
starpu_omp_set_nest_lock
Lock an opaque lock object supporting nested locking operations. If the lock is already locked by another task, the function will block until it succeeds in exclusively acquiring the lock. If the lock is already taken by the current task, the function will increase the nested locking level of the lock object.
starpu_omp_set_nested
Enable (1) or disable (0) nested parallel regions.
starpu_omp_set_num_threads
Set the ICV (internal control variable) nthreads_var for the parallel regions to be created with the current region.
starpu_omp_set_schedule
Set the default scheduling kind for upcoming loops within the current parallel section. \p kind is the scheduler kind, \p modifier complements the scheduler kind with information such as the chunk size, in accordance with the OpenMP specification.
starpu_omp_shutdown
Shutdown StarPU and its OpenMP Runtime support. See \ref OMPInitExit for more details.
starpu_omp_single
Ensure that a single participating thread of the innermost OpenMP parallel region executes a function. \p f is the function to be executed by a single thread. \p arg is an argument passed to function \p f. \p nowait is a flag indicating whether an implicit barrier is requested after the single section (nowait==0) or not (nowait==!0).
starpu_omp_single_copyprivate
Execute \p f on a single task of the current parallel region task, and then broadcast the contents of the memory block pointed by the copyprivate pointer \p data and of size \p data_size to the corresponding \p data pointed memory blocks of all the other participating region tasks. This function can be used to implement #pragma omp single with a copyprivate clause.
starpu_omp_single_copyprivate_inline_begin
Elect one task among the tasks of the current parallel region task to execute the following single section, and then broadcast the copyprivate pointer \p data to all the other participating region tasks. This function can be used to implement #pragma omp single with a copyprivate clause without code outlining.
starpu_omp_single_copyprivate_inline_end
Complete the execution of a single section and return the broadcasted copyprivate pointer for tasks that lost the election and NULL for the task that won the election. This function can be used to implement #pragma omp single with a copyprivate clause without code outlining.
starpu_omp_single_inline
Decide whether the current thread is elected to run the following single section among the participating threads of the innermost OpenMP parallel region.
starpu_omp_task_region
Generate an explicit child task. The execution of the generated task is asynchronous with respect to the calling code unless specified otherwise. \p attr specifies the attributes for the generated task region.
starpu_omp_taskgroup
Launch a function and wait for the completion of every descendant task generated during the execution of the function.
starpu_omp_taskgroup_inline_begin
Launch a function and get ready to wait for the completion of every descendant task generated during the dynamic scope of the taskgroup.
starpu_omp_taskgroup_inline_end
Wait for the completion of every descendant task generated during the dynamic scope of the taskgroup.
starpu_omp_taskloop_inline_begin
starpu_omp_taskloop_inline_end
starpu_omp_taskwait
Wait for the completion of the tasks generated by the current task. This function does not wait for the descendants of the tasks generated by the current task.
starpu_omp_test_lock
Attempt to lock a lock object without blocking, and return whether it succeeded or not.
starpu_omp_test_nest_lock
Attempt, without blocking, to lock an opaque lock object supporting nested locking operations, and return whether it succeeded or not. If the lock is already locked by another task, the function will return without having acquired the lock. If the lock is already taken by the current task, the function will increase the nested locking level of the lock object.
starpu_omp_unset_lock
Unlock a previously locked lock object. The behaviour of this function is unspecified if it is called on an unlocked lock object.
starpu_omp_unset_nest_lock
Unlock a previously locked lock object supporting nested locking operations. If the lock has been locked multiple times in nested fashion, the nested locking level is decreased and the lock remains locked. Otherwise, if the lock has only been locked once, it becomes unlocked. The behaviour of this function is unspecified if it is called on an unlocked lock object. The behaviour of this function is unspecified if it is called from a different task than the one that locked the lock object.
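The nested locking rules described for starpu_omp_test_nest_lock() and starpu_omp_unset_nest_lock() can be summarised by a small single-threaded state model. This is a sketch of the documented semantics only, with no real synchronisation: `NestLockModel`, `try_set` and `unset` are hypothetical names, and tasks are identified by plain integers.

```rust
/// Hypothetical state model of a lock supporting nested locking.
#[derive(Debug, Default)]
struct NestLockModel {
    owner: Option<u32>, // id of the task holding the lock, if any
    depth: u32,         // current nested locking level
}

impl NestLockModel {
    /// Non-blocking acquire: fails if another task holds the lock,
    /// otherwise takes it or increases the nesting level.
    fn try_set(&mut self, task: u32) -> bool {
        match self.owner {
            Some(current) if current != task => false,
            _ => {
                self.owner = Some(task);
                self.depth += 1;
                true
            }
        }
    }

    /// Release one nesting level; returns true once fully unlocked.
    /// Unsetting from a non-owner is unspecified in the text above,
    /// so this model simply panics in that case.
    fn unset(&mut self, task: u32) -> bool {
        assert_eq!(self.owner, Some(task), "unset by a task that does not hold the lock");
        self.depth -= 1;
        if self.depth == 0 {
            self.owner = None;
        }
        self.depth == 0
    }
}
```

A task that locks twice must therefore unlock twice before another task can acquire the lock.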
starpu_omp_vector_annotate
Enable setting additional vector metadata needed by the OpenMP Runtime Support.
starpu_opencl_allocate_memory
Allocate \p size bytes of memory, stored in \p addr. \p flags must be a valid combination of \c cl_mem_flags values. See \ref DefiningANewDataInterface_allocation for more details.
starpu_opencl_collect_stats
Collect statistics on a kernel execution. After termination of the kernels, the OpenCL codelet should call this function with the event returned by \c clEnqueueNDRangeKernel(), to let StarPU collect statistics about the kernel execution (used cycles, consumed energy). See \ref OpenCL-specificOptimizations for more details.
starpu_opencl_compile_opencl_from_file
Compile the OpenCL kernel stored in the file \p source_file_name with the given options \p build_options and store the result in the directory $STARPU_HOME/.starpu/opencl with the same filename as \p source_file_name. The compilation is done for every OpenCL device, and the filename is suffixed with the vendor id and the device id of the OpenCL device. See \ref OpenCLSupport for more details.
starpu_opencl_compile_opencl_from_string
Compile the OpenCL kernel in the string \p opencl_program_source with the given options \p build_options and store the result in the directory $STARPU_HOME/.starpu/opencl with the filename \p file_name. The compilation is done for every OpenCL device, and the filename is suffixed with the vendor id and the device id of the OpenCL device. See \ref OpenCLSupport for more details.
starpu_opencl_copy_async_sync
Copy \p size bytes from byte offset \p src_offset of \p src on \p src_node to byte offset \p dst_offset of \p dst on \p dst_node. If \p event is NULL, the copy is synchronous, i.e. the queue is synchronised before returning. If not NULL, \p event can be used after the call to wait for this particular copy to complete. The function returns -EAGAIN if the asynchronous launch was successful. It returns 0 if the synchronous copy was successful, or fails otherwise. See \ref DefiningANewDataInterface_copy for more details.
starpu_opencl_copy_opencl_to_opencl
Copy \p size bytes asynchronously from byte offset \p src_offset of \p src on OpenCL \p src_node to byte offset \p dst_offset of \p dst on OpenCL \p dst_node. If \p event is NULL, the copy is synchronous, i.e. the queue is synchronised before returning. If not NULL, \p event can be used after the call to wait for this particular copy to complete. This function returns CL_SUCCESS if the copy was successful, or a valid OpenCL error code otherwise. The integer pointed to by \p ret is set to -EAGAIN if the asynchronous launch was successful, or to 0 if \p event was NULL. See \ref DefiningANewDataInterface_copy for more details.
starpu_opencl_copy_opencl_to_ram
Copy \p size bytes asynchronously from the given \p buffer on OpenCL \p src_node to the given \p ptr on RAM \p dst_node. \p offset is the offset, in bytes, in \p buffer. If \p event is NULL, the copy is synchronous, i.e. the queue is synchronised before returning. If not NULL, \p event can be used after the call to wait for this particular copy to complete. This function returns CL_SUCCESS if the copy was successful, or a valid OpenCL error code otherwise. The integer pointed to by \p ret is set to -EAGAIN if the asynchronous launch was successful, or to 0 if \p event was NULL. See \ref DefiningANewDataInterface_copy for more details.
starpu_opencl_copy_ram_to_opencl
Copy \p size bytes from the given \p ptr on RAM \p src_node to the given \p buffer on OpenCL \p dst_node. \p offset is the offset, in bytes, in \p buffer. If \p event is NULL, the copy is synchronous, i.e. the queue is synchronised before returning. If not NULL, \p event can be used after the call to wait for this particular copy to complete. This function returns CL_SUCCESS if the copy was successful, or a valid OpenCL error code otherwise. The integer pointed to by \p ret is set to -EAGAIN if the asynchronous launch was successful, or to 0 if \p event was NULL. See \ref DefiningANewDataInterface_copy for more details.
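The status convention shared by the OpenCL copy functions (0 for a completed synchronous copy, -EAGAIN for a successfully launched asynchronous one) is easy to misread as an error code. A caller-side helper along the following lines could make it explicit. This is a sketch: `CopyOutcome` and `classify_copy` are hypothetical names, and `EAGAIN` is hard-coded to its usual Linux value as an assumption.

```rust
/// Assumed value of EAGAIN on Linux; other platforms may differ.
const EAGAIN: i32 = 11;

/// Hypothetical interpretation of the status value: the return value of
/// starpu_opencl_copy_async_sync(), or the integer written through `ret`
/// by the other copy functions.
#[derive(Debug, PartialEq)]
enum CopyOutcome {
    /// 0: the synchronous copy has completed.
    Completed,
    /// -EAGAIN: the asynchronous copy was launched; wait on the event.
    InFlight,
    /// Any other value, passed through unchanged.
    Failed(i32),
}

fn classify_copy(status: i32) -> CopyOutcome {
    match status {
        0 => CopyOutcome::Completed,
        s if s == -EAGAIN => CopyOutcome::InFlight,
        other => CopyOutcome::Failed(other),
    }
}
```

For the functions that return a `cl_int`, the OpenCL error code still has to be checked separately; this helper only covers the status value.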
starpu_opencl_display_error
Given a valid error status, print the corresponding error message on \c stdout, along with the function name \p func, the filename \p file, the line number \p line and the message \p msg. See \ref OpenCLSupport for more details.
starpu_opencl_error_string
Return the error message in English corresponding to \p status, an OpenCL error code. See \ref OpenCLSupport for more details.
starpu_opencl_get_context
Return the OpenCL context of the device designated by \p devid in \p context. See \ref OpenCLSupport for more details.
starpu_opencl_get_current_context
Return the context of the current worker. See \ref OpenCLSupport for more details.
starpu_opencl_get_current_queue
Return the computation kernel command queue of the current worker. See \ref OpenCLSupport for more details.
starpu_opencl_get_device
Return the cl_device_id corresponding to \p devid in \p device. See \ref OpenCLSupport for more details.
starpu_opencl_get_queue
Return the command queue of the device designated by \p devid into \p queue. See \ref OpenCLSupport for more details.
starpu_opencl_load_binary_opencl
Compile the binary OpenCL kernel identified with \p kernel_id. For every OpenCL device, the binary OpenCL kernel will be loaded from the file $STARPU_HOME/.starpu/opencl/<kernel_id>.<device_type>.vendor_id_<vendor_id>_device_id_<device_id>. See \ref OpenCLSupport for more details.
starpu_opencl_load_kernel
Create a kernel \p kernel for device \p devid, on its computation command queue returned in \p queue, using program \p opencl_programs and name \p kernel_name. See \ref OpenCLSupport for more details.
starpu_opencl_load_opencl_from_file
Compile an OpenCL source code stored in a file. See \ref OpenCLSupport for more details.
starpu_opencl_load_opencl_from_string
Compile an OpenCL source code stored in a string. See \ref OpenCLSupport for more details.
starpu_opencl_load_program_source
Store the contents of the file \p source_file_name in the buffer \p opencl_program_source. The file \p source_file_name can be located in the current directory, or in the directory specified by the environment variable \ref STARPU_OPENCL_PROGRAM_DIR, or in the directory share/starpu/opencl of the installation directory of StarPU, or in the source directory of StarPU. When the file is found, \p located_file_name is the full name of the file as it has been located on the system, \p located_dir_name the directory where it has been located. Otherwise, they are both set to the empty string. See \ref OpenCLSupport for more details.
starpu_opencl_load_program_source_malloc
Similar to function starpu_opencl_load_program_source() but allocate the buffers \p located_file_name, \p located_dir_name and \p opencl_program_source. See \ref OpenCLSupport for more details.
starpu_opencl_release_kernel
Release the given \p kernel, to be called after kernel execution. See \ref OpenCLSupport for more details.
starpu_opencl_set_kernel_args
Set the arguments of a given kernel. The list of arguments must be given as (size_t size_of_the_argument, cl_mem * pointer_to_the_argument). The last argument must be 0. On success, return the number of arguments that were set. In case of failure, return the id of the argument that could not be set, and \p err is set to the error returned by OpenCL.
starpu_opencl_unload_opencl
Unload an OpenCL compiled code. See \ref OpenCLSupport for more details.
starpu_opencl_worker_get_count
Return the number of OpenCL devices controlled by StarPU. The return value should be at most \ref STARPU_MAXOPENCLDEVS. See \ref TopologyWorkers for more details.
starpu_parallel_task_barrier_init
Initialise the barrier for the parallel task, and dispatch the task between the different workers of the given combined worker. See \ref SchedulingHelpers for more details.
starpu_parallel_task_barrier_init_n
Initialise the barrier for the parallel task, to be pushed to \p worker_size workers (without having to specify a given combined worker). See \ref SchedulingHelpers for more details.
starpu_parallel_worker_init
Create parallel_workers on the machine with the given parameters. See \ref CreatingParallel for more details.
starpu_parallel_worker_openmp_prologue
Prologue functions
starpu_parallel_worker_print
Print the given parallel_workers configuration. See \ref CreatingParallel for more details.
starpu_parallel_worker_shutdown
Delete the given parallel_workers configuration.
starpu_pause
Suspend the processing of new tasks by workers. It can be used in a program where StarPU is used during only a part of the execution. Without this call, the workers continue to poll for new tasks in a tight loop, wasting CPU time. The symmetric call to starpu_resume() should be used to unfreeze the workers. See \ref KernelThreadsStartedByStarPU and \ref PauseResume for more details.
starpu_perf_counter_collection_start
Start collecting performance counter values.
starpu_perf_counter_collection_stop
Stop collecting performance counter values.
starpu_perf_counter_get_help_string
Return the counter’s help string.
starpu_perf_counter_get_type_id
Return the counter’s type id.
starpu_perf_counter_id_to_name
Translate a counter id to its name constant string.
starpu_perf_counter_list_all_avail
Display the list of counters defined in all scopes.
starpu_perf_counter_list_avail
Display the list of counters defined in the given scope.
starpu_perf_counter_listener_exit
End a performance counter listener.
starpu_perf_counter_listener_init
Initialize a new performance counter listener.
starpu_perf_counter_name_to_id
Translate a performance counter name to its id.
starpu_perf_counter_nb
Return the number of performance counters for the given scope.
starpu_perf_counter_nth_to_id
Translate a performance counter rank in its scope to its counter id.
starpu_perf_counter_sample_get_double_value
Read a double counter value from a sample.
starpu_perf_counter_sample_get_float_value
Read a float counter value from a sample.
starpu_perf_counter_sample_get_int32_value
Read an int32 counter value from a sample.
starpu_perf_counter_sample_get_int64_value
Read an int64 counter value from a sample.
starpu_perf_counter_scope_id_to_name
Translate scope id to scope name constant string.
starpu_perf_counter_scope_name_to_id
Translate scope name constant string to scope id.
starpu_perf_counter_set_all_per_worker_listeners
Set a common listener for all workers.
starpu_perf_counter_set_alloc
Allocate a new performance counter set.
starpu_perf_counter_set_disable_id
Disable a given counter in the set.
starpu_perf_counter_set_enable_id
Enable a given counter in the set.
starpu_perf_counter_set_free
Free a performance counter set.
starpu_perf_counter_set_global_listener
Set a listener for the global scope.
starpu_perf_counter_set_per_codelet_listener
Set a per_codelet listener for a codelet.
starpu_perf_counter_set_per_worker_listener
Set a listener for the per_worker scope on a given worker.
starpu_perf_counter_type_id_to_name
Translate type id to type name constant string.
starpu_perf_counter_type_name_to_id
Translate type name constant string to type id.
starpu_perf_counter_unset_all_per_worker_listeners
Unset all per_worker listeners.
starpu_perf_counter_unset_global_listener
Unset the global listener.
starpu_perf_counter_unset_per_codelet_listener
Unset a per_codelet listener.
starpu_perf_counter_unset_per_worker_listener
Unset the per_worker listener.
starpu_perf_knob_get_global_double_value
Get double knob value for Global scope.
starpu_perf_knob_get_global_float_value
Get float knob value for Global scope.
starpu_perf_knob_get_global_int32_value
Get int32 knob value for Global scope.
starpu_perf_knob_get_global_int64_value
Get int64 knob value for Global scope.
starpu_perf_knob_get_help_string
Return the knob’s help string.
starpu_perf_knob_get_per_scheduler_double_value
Get double value for per_scheduler scope.
starpu_perf_knob_get_per_scheduler_float_value
Get float value for per_scheduler scope.
starpu_perf_knob_get_per_scheduler_int32_value
Get int32 value for per_scheduler scope.
starpu_perf_knob_get_per_scheduler_int64_value
Get int64 value for per_scheduler scope.
starpu_perf_knob_get_per_worker_double_value
Get double value for Per_worker scope.
starpu_perf_knob_get_per_worker_float_value
Get float value for Per_worker scope.
starpu_perf_knob_get_per_worker_int32_value
Get int32 value for Per_worker scope.
starpu_perf_knob_get_per_worker_int64_value
Get int64 value for Per_worker scope.
starpu_perf_knob_get_type_id
Return the knob’s type id.
starpu_perf_knob_id_to_name
Translate a knob id to its name constant string.
starpu_perf_knob_list_all_avail
Display the list of knobs defined in all scopes.
starpu_perf_knob_list_avail
Display the list of knobs defined in the given scope.
starpu_perf_knob_name_to_id
Translate a performance knob name to its id.
starpu_perf_knob_nb
Return the number of performance steering knobs for the given scope.
starpu_perf_knob_nth_to_id
Translate a performance knob rank in its scope to its knob id.
starpu_perf_knob_scope_id_to_name
Translate scope id to scope name constant string.
starpu_perf_knob_scope_name_to_id
Translate scope name constant string to scope id.
starpu_perf_knob_set_global_double_value
Set double knob value for Global scope.
starpu_perf_knob_set_global_float_value
Set float knob value for Global scope.
starpu_perf_knob_set_global_int32_value
Set int32 knob value for Global scope.
starpu_perf_knob_set_global_int64_value
Set int64 knob value for Global scope.
starpu_perf_knob_set_per_scheduler_double_value
Set double value for per_scheduler scope.
starpu_perf_knob_set_per_scheduler_float_value
Set float value for per_scheduler scope.
starpu_perf_knob_set_per_scheduler_int32_value
Set int32 value for per_scheduler scope.
starpu_perf_knob_set_per_scheduler_int64_value
Set int64 value for per_scheduler scope.
starpu_perf_knob_set_per_worker_double_value
Set double value for Per_worker scope.
starpu_perf_knob_set_per_worker_float_value
Set float value for Per_worker scope.
starpu_perf_knob_set_per_worker_int32_value
Set int32 value for Per_worker scope.
starpu_perf_knob_set_per_worker_int64_value
Set int64 value for Per_worker scope.
starpu_perf_knob_type_id_to_name
Translate type id to type name constant string.
starpu_perf_knob_type_name_to_id
Translate type name constant string to type id.
starpu_perfmodel_arch_comb_add
starpu_perfmodel_arch_comb_fetch
starpu_perfmodel_arch_comb_get
starpu_perfmodel_debugfilepath
Return the path to the debugging information for the performance model.
starpu_perfmodel_deinit
Deinitialize the \p model performance model structure. You need to call this before deallocating the structure. You will probably want to call starpu_perfmodel_unload_model() before calling this function, to save the perfmodel.
starpu_perfmodel_directory
Print the directory name storing performance models on \p output
starpu_perfmodel_dump_xml
Dump performance model \p model to output stream \p output, in XML format. See \ref PerformanceModelExample for more details.
starpu_perfmodel_free_sampling
Free internal memory used for sampling management. It should only be called by an application which does not call starpu_shutdown(), as starpu_shutdown() already calls it. See for example tools/starpu_perfmodel_display.c.
starpu_perfmodel_get_arch_name
Return the architecture name for \p arch
starpu_perfmodel_get_archtype_name
starpu_perfmodel_get_model_path
Fill \p path (assumed to be \p maxlen long) with the full path to the performance model file for symbol \p symbol. This path can later be used, for instance, with starpu_perfmodel_load_file().
starpu_perfmodel_get_model_per_arch
starpu_perfmodel_get_model_per_devices
starpu_perfmodel_get_narch_combs
starpu_perfmodel_history_based_expected_perf
Return the estimated time in µs of a task with the given model and the given footprint.
starpu_perfmodel_init
Initialize the \p model performance model structure. This is automatically called when e.g. submitting a task using a codelet using this performance model.
starpu_perfmodel_initialize
If starpu_init() is not used, starpu_perfmodel_initialize() should be called before calling starpu_perfmodel_* functions.
starpu_perfmodel_list
Print a list of all performance models on \p output
starpu_perfmodel_list_combs
starpu_perfmodel_load_file
Load the performance model found in the file named \p filename. \p model has to be completely zero, and will be filled with the information stored in the given file.
starpu_perfmodel_load_symbol
Load a given performance model. \p model has to be completely zero, and will be filled with the information stored in $STARPU_HOME/.starpu. The function is intended to be used by external tools that want to read the performance model files.
starpu_perfmodel_print
starpu_perfmodel_print_all
starpu_perfmodel_print_estimations
starpu_perfmodel_set_per_devices_cost_function
starpu_perfmodel_set_per_devices_size_base
starpu_perfmodel_unload_model
Unload \p model which has been previously loaded through the function starpu_perfmodel_load_symbol()
starpu_perfmodel_update_history
Feed the performance model \p model with one explicit measurement (in µs or J), in addition to measurements done by StarPU itself. This can be useful when the application already has an existing set of measurements done in good conditions, that StarPU could benefit from instead of doing on-line measurements. An example of use can be seen in \ref PerformanceModelExample.
starpu_perfmodel_update_history_n
Feed the performance model \p model with an explicit average measurement (in µs or J).
starpu_prefetch_task_input_for
Prefetch data for a given \p task on a given \p worker. See \ref SchedulingHelpers for more details.
starpu_prefetch_task_input_for_prio
Prefetch data for a given \p task on a given \p worker with a given priority. See \ref SchedulingHelpers for more details.
starpu_prefetch_task_input_on_node
Prefetch data for a given \p task on a given \p node. See \ref SchedulingHelpers for more details.
starpu_prefetch_task_input_on_node_prio
Prefetch data for a given \p task on a given \p node with a given priority. See \ref SchedulingHelpers for more details.
starpu_profiling_bus_helper_display_summary
Display statistics about the bus on \c stderr if the environment variable \ref STARPU_BUS_STATS is defined. The function is called automatically by starpu_shutdown(). See \ref DataStatistics for more details.
starpu_profiling_init
Reset performance counters and enable profiling if the environment variable \ref STARPU_PROFILING is set to a positive value. See \ref EnablingOn-linePerformanceMonitoring for more details.
starpu_profiling_set_id
Set the ID used for profiling trace filename. Has to be called before starpu_init(). See \ref TraceMpi for more details.
starpu_profiling_status_get
Return the current profiling status or a negative value in case there was an error. See \ref EnablingOn-linePerformanceMonitoring for more details.
starpu_profiling_status_set
Set the profiling status. Profiling is activated by passing \ref STARPU_PROFILING_ENABLE in \p status. Passing \ref STARPU_PROFILING_DISABLE disables profiling. Calling this function resets all profiling measurements. When profiling is enabled, the field starpu_task::profiling_info points to a valid structure starpu_profiling_task_info containing information about the execution of the task. Negative return values indicate an error, otherwise the previous status is returned. See \ref EnablingOn-linePerformanceMonitoring for more details.
starpu_profiling_worker_get_info
Get the profiling info associated to the worker identified by \p workerid, and reset the profiling measurements. If the argument \p worker_info is NULL, only reset the counters associated to worker \p workerid. Upon successful completion, this function returns 0. Otherwise, a negative value is returned. See \ref Per-workerFeedback for more details.
starpu_profiling_worker_helper_display_summary
Display statistics about the workers on \c stderr if the environment variable \ref STARPU_WORKER_STATS is defined. The function is called automatically by starpu_shutdown(). See \ref DataStatistics for more details.
starpu_progression_hook_deregister
Unregister a given progression hook.
starpu_progression_hook_register
Register a progression hook, to be called when workers are idle.
starpu_pthread_mutex_check_sched
starpu_pthread_mutex_lock_sched
starpu_pthread_mutex_trylock_sched
starpu_pthread_mutex_unlock_sched
starpu_pthread_spin_destroy
starpu_pthread_spin_init
starpu_pthread_spin_lock
starpu_pthread_spin_trylock
starpu_pthread_spin_unlock
starpu_push_local_task
The scheduling policy may put tasks directly into a worker’s local queue so that it is not always necessary to create its own queue when the local queue is sufficient. \p back is ignored: the task priority is used to order tasks in this queue. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_push_task_end
Must be called by a scheduler to notify that the given task has just been pushed. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_resume
Symmetrical call to starpu_pause(), used to resume the workers polling for new tasks. This would typically be called only once all tasks have been submitted. See \ref KernelThreadsStartedByStarPU and \ref PauseResume for more details.
starpu_save_history_based_model
Save the performance model in its file.
starpu_sched_ctx_add_workers
Dynamically add the workers in \p workerids_ctx to the context \p sched_ctx_id. The last argument cannot be greater than ::STARPU_NMAX_SCHED_CTXS. See \ref ModifyingAContext for more details.
starpu_sched_ctx_bind_current_thread_to_cpuid
starpu_sched_ctx_book_workers_for_task
starpu_sched_ctx_contains_type_of_worker
starpu_sched_ctx_contains_worker
Return 1 if the worker belongs to the context and 0 otherwise
starpu_sched_ctx_create
Create a scheduling context with the given parameters (see below) and assign the workers in \p workerids_ctx to execute the tasks submitted to it. The return value represents the identifier of the context that has just been created. It will be further used to indicate the context the tasks will be submitted to. The return value should be at most ::STARPU_NMAX_SCHED_CTXS.
starpu_sched_ctx_create_inside_interval
Create a context indicating an approximate interval of resources
starpu_sched_ctx_create_worker_collection
Create a worker collection of the type indicated by the last parameter for the context specified through the first parameter.
starpu_sched_ctx_delete
Delete scheduling context \p sched_ctx_id and transfer remaining workers to the inheritor scheduling context. See \ref DeletingAContext for more details.
starpu_sched_ctx_delete_worker_collection
Delete the worker collection of the specified scheduling context
starpu_sched_ctx_display_workers
Print on the file \p f the worker names belonging to the context \p sched_ctx_id
starpu_sched_ctx_exec_parallel_code
Execute any parallel code on the workers of the sched_ctx (workers are blocked)
starpu_sched_ctx_finished_submit
Indicate to StarPU that the application has finished submitting tasks to this context, so that the workers can be moved to the inheritor as soon as possible. See \ref DeletingAContext for more details.
starpu_sched_ctx_get_available_cpuids
starpu_sched_ctx_get_context
Return the scheduling context the tasks are currently submitted to, or ::STARPU_NMAX_SCHED_CTXS if no default context has been defined by calling the function starpu_sched_ctx_set_context().
starpu_sched_ctx_get_ctx_for_task
starpu_sched_ctx_get_hierarchy_level
starpu_sched_ctx_get_inheritor
starpu_sched_ctx_get_max_priority
Return the current maximum priority level supported by the scheduling policy of the given scheduler context.
starpu_sched_ctx_get_min_priority
Return the current minimum priority level supported by the scheduling policy of the given scheduler context.
starpu_sched_ctx_get_nready_flops
starpu_sched_ctx_get_nready_tasks
starpu_sched_ctx_get_nshared_workers
Return the number of workers shared by two contexts.
starpu_sched_ctx_get_nsms
starpu_sched_ctx_get_nworkers
Return the number of workers managed by the specified context (Usually needed to verify if it manages any workers or if it should be blocked)
starpu_sched_ctx_get_policy_data
Return the scheduling policy data (private information of the scheduler) previously assigned to the context. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_sched_ctx_get_priority
starpu_sched_ctx_get_sched_policy
starpu_sched_ctx_get_sched_policy_callback
Return the function associated with the scheduler context \p sched_ctx_id which was given through the field starpu_conf::sched_policy_callback
starpu_sched_ctx_get_sms_interval
starpu_sched_ctx_get_stream_worker
starpu_sched_ctx_get_user_data
Return the user data pointer associated to the scheduling context.
starpu_sched_ctx_get_worker_collection
Return the worker collection managed by the indicated context
starpu_sched_ctx_get_worker_rank
starpu_sched_ctx_get_workers_list
Return the list of workers in the array \p workerids, the return value is the number of workers. The user should free the \p workerids table after finishing using it (it is allocated inside the function with the proper size)
starpu_sched_ctx_get_workers_list_raw
Return the list of workers in the array \p workerids, the return value is the number of workers. This list is provided in raw order, i.e. not sorted by tree or list order, and the user should not free the \p workerids table. This function is thus much less costly than starpu_sched_ctx_get_workers_list().
starpu_sched_ctx_has_starpu_scheduler
starpu_sched_ctx_list_task_counters_decrement
starpu_sched_ctx_list_task_counters_decrement_all_ctx_locked
starpu_sched_ctx_list_task_counters_increment
starpu_sched_ctx_list_task_counters_increment_all_ctx_locked
starpu_sched_ctx_list_task_counters_reset
starpu_sched_ctx_list_task_counters_reset_all
starpu_sched_ctx_master_get_context
Return the context id of masterid if it is the master of a context. If not, return ::STARPU_NMAX_SCHED_CTXS.
starpu_sched_ctx_max_priority_is_set
starpu_sched_ctx_min_priority_is_set
starpu_sched_ctx_move_task_to_ctx_locked
starpu_sched_ctx_overlapping_ctxs_on_worker
Check if a worker is shared between several contexts
starpu_sched_ctx_register_close_callback
Execute the callback whenever the last task of the context has finished executing. The callback is called with the parameter \p sched_ctx and any other parameters needed by the application (packed in \p args).
starpu_sched_ctx_remove_workers
Remove the workers in \p workerids_ctx from the context \p sched_ctx_id. The last argument cannot be greater than ::STARPU_NMAX_SCHED_CTXS. See \ref ModifyingAContext for more details.
starpu_sched_ctx_revert_task_counters_ctx_locked
starpu_sched_ctx_set_context
Set the scheduling context the subsequent tasks will be submitted to. See \ref SubmittingTasksToAContext and \ref TmpCTXS for more details.
starpu_sched_ctx_set_inheritor
Indicate that the context \p inheritor will inherit the resources of the context \p sched_ctx_id when \p sched_ctx_id is deleted. See \ref DeletingAContext for more details.
starpu_sched_ctx_set_max_priority
Define the maximum priority level supported by the scheduling policy of the given scheduler context. The default maximum priority level is 1. The application may access that value by calling the starpu_sched_ctx_get_max_priority() function. This function should only be called from the initialization method of the scheduling policy, and should not be used directly from the application.
starpu_sched_ctx_set_min_priority
Define the minimum task priority level supported by the scheduling policy of the given scheduler context. The default minimum priority level is the same as the default priority level which is 0 by convention. The application may access that value by calling the function starpu_sched_ctx_get_min_priority(). This function should only be called from the initialization method of the scheduling policy, and should not be used directly from the application.
starpu_sched_ctx_set_policy_data
Allocate the scheduling policy data (private information of the scheduler such as queues, variables, additional condition variables) for the context. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_sched_ctx_set_priority
starpu_sched_ctx_set_user_data
starpu_sched_ctx_stop_task_submission
Stop submitting tasks from the empty context list until the next time the context has time to check the empty context list. See \ref EmptyingAContext for more details.
starpu_sched_ctx_unbook_workers_for_task
starpu_sched_ctx_worker_get_id
Return the workerid if the worker belongs to the context and -1 otherwise. If the thread calling this function is not a worker the function returns -1 as it calls the function starpu_worker_get_id().
starpu_sched_ctx_worker_is_master_for_child_ctx
Return the first context (child of sched_ctx_id) where the workerid is master
starpu_sched_ctx_worker_shares_tasks_lists
The scheduling policy indicates whether the worker may pop tasks from the lists of other workers or whether there is a central list with tasks for all the workers. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_sched_find_all_worker_combinations
See \ref SchedulingHelpers for more details.
starpu_sched_get_max_priority
TODO: check if this is correct. Return the current maximum priority level supported by the scheduling policy. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_sched_get_min_priority
TODO: check if this is correct. Return the current minimum priority level supported by the scheduling policy. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_sched_get_predefined_policies
Return a NULL-terminated array of all the predefined scheduling policies. See \ref TaskSchedulingPolicy for more details.
starpu_sched_get_sched_policy
Return the scheduler policy of the default context. See \ref TaskSchedulingPolicy for more details.
starpu_sched_get_sched_policy_in_ctx
Return the scheduler policy of the given context. See \ref TaskSchedulingPolicy for more details.
starpu_sched_set_max_priority
TODO: check if this is correct. Define the maximum priority level supported by the scheduling policy. The default maximum priority level is 1. The application may access that value by calling the function starpu_sched_get_max_priority(). This function should only be called from the initialization method of the scheduling policy, and should not be used directly from the application. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_sched_set_min_priority
TODO: check if this is correct. Define the minimum task priority level supported by the scheduling policy. The default minimum priority level is the same as the default priority level, which is 0 by convention. The application may access that value by calling the function starpu_sched_get_min_priority(). This function should only be called from the initialization method of the scheduling policy, and should not be used directly from the application. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_sched_task_break
The scheduling policy should call this when it makes a scheduling decision for a task. This will possibly stop execution at this point, and then the programmer can inspect local variables etc. to determine why this scheduling decision was made.
starpu_sem_trywait
starpu_sem_wait
starpu_set_limit_max_submitted_tasks
Specify a maximum number of submitted tasks allowed at a given time. This makes it possible to control the task submission flow. The value can also be specified with the environment variable \ref STARPU_LIMIT_MAX_SUBMITTED_TASKS. See \ref HowToReduceTheMemoryFootprintOfInternalDataStructures for more details.
starpu_set_limit_min_submitted_tasks
Specify a minimum number of submitted tasks allowed at a given time. This makes it possible to control the task submission flow. The value can also be specified with the environment variable \ref STARPU_LIMIT_MIN_SUBMITTED_TASKS. See \ref HowToReduceTheMemoryFootprintOfInternalDataStructures for more details.
starpu_shutdown
StarPU termination method, must be called at the end of the application: statistics and other post-mortem debugging information are not guaranteed to be available until this method has been called. See \ref SubmittingATask for more details.
starpu_sleep
Sleep for the given \p nb_sec seconds. Similar to calling Unix’ \c sleep function, except that it takes a float to allow sub-second sleeping, and when StarPU is compiled in SimGrid mode it does not really sleep but just makes SimGrid record that the thread has taken some time to sleep. See \ref Helpers for more details.
starpu_tag_declare_deps
Specify the dependencies of the task identified by tag \p id. The first argument specifies the tag which is configured, the second argument gives the number of tag(s) on which \p id depends. The following arguments are the tags which have to be terminated to unlock the task. This function must be called before the associated task is submitted to StarPU with starpu_task_submit().
starpu_tag_declare_deps_array
Similar to starpu_tag_declare_deps(), except that it does not take a variable number of arguments but an \p array of tags of size \p ndeps.
starpu_tag_get_task
Return the task associated to the tag \p id. See \ref TasksAndTagsDependencies for more details.
starpu_tag_notify_from_apps
Explicitly unlock tag \p id. It may be useful in the case of applications which execute part of their computation outside StarPU tasks (e.g. third-party libraries). It is also provided as a convenient tool for the programmer, for instance to entirely construct the task DAG before actually giving StarPU the opportunity to execute the tasks. When called several times on the same tag, notification will be done only on first call, thus implementing “OR” dependencies, until the tag is restarted using starpu_tag_restart(). See \ref TasksAndTagsDependencies for more details.
starpu_tag_notify_restart_from_apps
Atomically call starpu_tag_notify_from_apps() and starpu_tag_restart() on tag \p id. This is useful with cyclic graphs, when we want to safely trigger its startup. See \ref TasksAndTagsDependencies for more details.
starpu_tag_remove
Release the resources associated to tag \p id. It can be called once the corresponding task has been executed and when there is no other tag that depends on this tag anymore. See \ref TasksAndTagsDependencies for more details.
starpu_tag_restart
Clear the already notified status of a tag which is not associated with a task. Before that, calling starpu_tag_notify_from_apps() again will not notify the successors. After that, the next call to starpu_tag_notify_from_apps() will notify the successors. See \ref TasksAndTagsDependencies for more details.
starpu_tag_wait
Block until the task associated to tag \p id has been executed. This is a blocking call which must therefore not be called within tasks or callbacks, but only from the application directly. It is possible to synchronize with the same tag multiple times, as long as the starpu_tag_remove() function is not called. Note that it is still possible to synchronize with a tag associated to a task for which the structure starpu_task was freed (e.g. if the field starpu_task::destroy was enabled). See \ref WaitingForTasks for more details.
starpu_tag_wait_array
Similar to starpu_tag_wait() except that it blocks until all the \p ntags tags contained in the array \p id are terminated. See \ref WaitingForTasks for more details.
starpu_task_build
Create a task corresponding to \p cl with the following arguments. The argument list must be zero-terminated. The arguments following the codelet are the same as the ones for the function starpu_task_insert(). If some arguments of type ::STARPU_VALUE are given, the parameter starpu_task::cl_arg_free will be set to 1. See \ref OtherTaskUtility for more details.
starpu_task_bundle_close
Inform the runtime that the user will not modify \p bundle anymore, i.e. no more tasks will be inserted or removed. The runtime can thus destroy it when possible.
starpu_task_bundle_create
Factory function creating and initializing \p bundle. When the call returns, the needed memory is allocated and \p bundle is ready to use.
starpu_task_bundle_expected_data_transfer_time
Return the time (in micro-seconds) expected to transfer all data used within \p bundle.
starpu_task_bundle_expected_energy
Return the expected energy consumption of \p bundle in J.
starpu_task_bundle_expected_length
Return the expected duration of \p bundle in micro-seconds.
starpu_task_bundle_insert
Insert \p task in \p bundle. Until \p task is removed from \p bundle, its expected length and data transfer time will be considered along with those of the other tasks of \p bundle. This function must not be called if \p bundle is already closed and/or \p task is already submitted. On success, it returns 0. There are two cases of error: if \p bundle is already closed it returns -EPERM, and if \p task was already submitted it returns -EINVAL.
starpu_task_bundle_remove
Remove \p task from \p bundle. Of course \p task must have been previously inserted in \p bundle. This function must not be called if \p bundle is already closed and/or \p task is already submitted. Doing so would result in undefined behaviour. On success, it returns 0. If \p bundle is already closed it returns -ENOENT.
starpu_task_clean
Release all the structures automatically allocated to execute \p task, but not the task structure itself; values set by the user remain unchanged. It is thus useful for statically allocated tasks, for instance. It is also useful when users want to execute the same operation several times with as little overhead as possible. It is called automatically by starpu_task_destroy(). It has to be called only after explicitly waiting for the task or after starpu_shutdown() (waiting for the callback is not enough, since StarPU still manipulates the task after calling the callback). See \ref PerformanceModelCalibration for more details.
starpu_task_create
Allocate a task structure and initialize it with default values. Tasks allocated dynamically with starpu_task_create() are automatically freed when the task is terminated. This means that the task pointer can not be used any more once the task is submitted, since it can be executed at any time (unless dependencies make it wait) and thus freed at any time. If the field starpu_task::destroy is explicitly unset, the resources used by the task have to be freed by calling starpu_task_destroy(). See \ref SubmittingATask for more details.
starpu_task_create_sync
Allocate a task structure that does nothing but access data \p handle with mode \p mode. This makes it possible to synchronize with the task graph, according to the sequential consistency, against tasks submitted before or after submitting this task. One can then use starpu_task_declare_deps_array() or starpu_task_end_dep_add() / starpu_task_end_dep_release() to add dependencies against this task before submitting it. See \ref SynchronizationTasks for more details.
starpu_task_data_footprint
Return the raw footprint for the data of a given task (without taking into account user-provided functions). See \ref PerformanceModelExample for more details.
starpu_task_declare_deps
Declare task dependencies between a \p task and a series of \p ndeps tasks, similarly to starpu_task_declare_deps_array(), but the tasks are passed after \p ndeps, which indicates how many tasks \p task shall be made to depend on. If \p ndeps is 0, no dependency is added. See \ref TasksAndTagsDependencies for more details.
starpu_task_declare_deps_array
Declare task dependencies between a \p task and an array of tasks of length \p ndeps. This function must be called prior to the submission of the task, but it may be called after the submission or the execution of the tasks in the array, provided the tasks are still valid (i.e. they were not automatically destroyed). Calling this function on a task that was already submitted, or with an entry of \p task_array that is no longer a valid task, results in undefined behaviour. If \p ndeps is 0, no dependency is added. It is possible to call starpu_task_declare_deps_array() several times on the same task; in this case, the dependencies are added. It is possible to have redundancy in the task dependencies. See \ref TasksAndTagsDependencies for more details.
starpu_task_declare_end_deps
Declare task end dependencies between a \p task and a series of \p ndeps tasks, similarly to starpu_task_declare_end_deps_array(), but the tasks are passed after \p ndeps, which indicates how many tasks \p task’s termination shall be made to depend on. If \p ndeps is 0, no dependency is added. See \ref TasksAndTagsDependencies for more details.
starpu_task_declare_end_deps_array
Declare task end dependencies between a \p task and an array of tasks of length \p ndeps. \p task will appear as terminated not only when \p task itself has terminated, but also when the tasks of \p task_array have terminated. This function must be called prior to the termination of the task, but it may be called after the submission or the execution of the tasks in the array, provided the tasks are still valid (i.e. they were not automatically destroyed). Calling this function on a task that was already terminated, or with an entry of \p task_array that is no longer a valid task, results in undefined behaviour. If \p ndeps is 0, no dependency is added. It is possible to call starpu_task_declare_end_deps_array() several times on the same task; in this case, the dependencies accumulate. Redundancy in the task dependencies is currently not supported. See \ref TasksAndTagsDependencies for more details.
starpu_task_destroy
Free the resources allocated during starpu_task_create() and associated with \p task. This function is called automatically after the execution of a task when the field starpu_task::destroy is set, which is the default for tasks created by starpu_task_create(). Calling this function on a statically allocated task results in undefined behaviour. See \ref Per-taskFeedback and \ref PerformanceModelExample for more details.
starpu_task_dup
Allocate a task structure which is the exact duplicate of \p task. See \ref OtherTaskUtility for more details.
starpu_task_end_dep_add
Add \p nb_deps end dependencies to the task \p t. This means the task will not terminate until the required number of calls to the function starpu_task_end_dep_release() has been made. See \ref TasksAndTagsDependencies for more details.
starpu_task_end_dep_release
Unlock 1 end dependency to the task \p t. This function must be called after starpu_task_end_dep_add(). See \ref TasksAndTagsDependencies for more details.
starpu_task_expected_conversion_time
Return expected conversion time in ms (multiformat interface only). See \ref SchedulingHelpers for more details.
starpu_task_expected_data_transfer_time
Return expected data transfer time in micro-seconds for the given \p memory_node. Prefer using starpu_task_expected_data_transfer_time_for() which is more precise. See \ref SchedulingHelpers for more details.
starpu_task_expected_data_transfer_time_for
Return expected data transfer time in micro-seconds for the given \p worker. See \ref SchedulingHelpers for more details.
starpu_task_expected_energy
Return expected energy use in J. See \ref SchedulingHelpers for more details.
starpu_task_expected_energy_average
Return expected task energy use in J, averaged over the different workers driven by the scheduler \p sched_ctx_id. Note: this is not just the average of the energy uses weighted by the number of processing units; it also accounts for their efficiency at processing the task, and is thus the harmonic average of the energy uses. See \ref SchedulingHelpers for more details.
starpu_task_expected_length
Return expected task duration in micro-seconds on a given architecture \p arch using given implementation \p nimpl. See \ref SchedulingHelpers for more details.
starpu_task_expected_length_average
Return expected task duration in micro-seconds, averaged over the different workers driven by the scheduler \p sched_ctx_id. Note: this is not just the average of the durations weighted by the number of processing units; it also accounts for their efficiency at processing the task, and is thus the harmonic average of the durations. See \ref SchedulingHelpers for more details.
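The two `_average` helpers above are described as returning a harmonic average, not an arithmetic one. The following is a minimal plain-Rust sketch of that arithmetic only. It does not call StarPU, and `harmonic_mean` is a hypothetical helper name that is not part of these bindings; how StarPU treats workers that cannot run the task is not modelled here.

```rust
/// Harmonic mean of per-worker expected values (e.g. durations in microseconds).
///
/// Hypothetical helper for illustration; not part of starpu-sys.
/// Returns `None` for an empty slice or when a value is not strictly positive.
fn harmonic_mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() || values.iter().any(|&v| !(v > 0.0)) {
        return None;
    }
    // Sum the rates (1 / value), then invert the average rate.
    let rate_sum: f64 = values.iter().map(|&v| 1.0 / v).sum();
    Some(values.len() as f64 / rate_sum)
}
```

For two workers with expected lengths of 100 and 400 micro-seconds, this gives about 160 micro-seconds, whereas the arithmetic mean would be 250: the faster worker weighs more in the result.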
starpu_task_finished
Return 1 if \p task is terminated. See \ref WaitingForTasks for more details.
starpu_task_footprint
Return the footprint for a given task, taking into account user-provided perfmodel footprint or size_base functions. See \ref PerformanceModelExample for more details.
starpu_task_ft_create_retry
Create a try-task for a \p meta_task, given a \p template_task task template. The meta task can be passed as template on the first call, but since it is mangled by starpu_task_ft_create_retry(), further calls (typically made by the check_ft callback) need to be passed the previous try-task as template task.
starpu_task_ft_failed
Record that this task failed, and should thus be retried. This is usually called from the task codelet function itself, after checking the result and noticing that the computation went wrong, and thus the task should be retried. The performance of this task execution will not be recorded for performance models.
starpu_task_ft_prologue
Function to be used as a prologue callback to enable fault tolerance for the task. This prologue will create a try-task, i.e. a duplicate of the task, which will do the actual computation.
starpu_task_ft_success
Notify that the try-task was successful and thus the meta-task was successful. See \ref TaskRetry for more details.
starpu_task_get_current
Return the task currently executed by the worker, or NULL if it is called either from a thread that is not a task or simply because there is no task being executed at the moment. See \ref Per-taskFeedback for more details.
starpu_task_get_current_data_node
Return the memory node number of parameter \p i of the task currently executed, or -1 if it is called either from a thread that is not a task or simply because there is no task being executed at the moment.
starpu_task_get_implementation
Return the codelet implementation to be executed when executing \p task. See \ref SchedulingHelpers for more details.
starpu_task_get_job_id
Return the job identifier associated with the task. See \ref TraceSchedTaskDetails for more details.
starpu_task_get_model_name
Return the name of the performance model of \p task. See \ref PerformanceModelExample for more details.
starpu_task_get_name
Return the name of \p task, i.e. either its starpu_task::name field, or the name of the corresponding performance model. See \ref TraceTaskDetails for more details.
starpu_task_get_task_scheduled_succs
Behave like starpu_task_get_task_succs(), except that it only reports tasks which will go through the scheduler, thus avoiding tasks with no codelet, or with explicit placement. See \ref GettingTaskChildren for more details.
starpu_task_get_task_succs
Fill \p task_array with the list of tasks which are direct children of \p task. \p ndeps is the size of \p task_array. This function returns the number of direct children. \p task_array can be set to NULL if \p ndeps is 0, which makes it possible to compute the number of children before allocating an array to store them. This function can only be called if \p task has not completed yet, otherwise the results are undefined. The result may also be outdated if some additional dependency has been added in the meantime. See \ref GettingTaskChildren for more details.
starpu_task_init
Initialize \p task with default values. This function is implicitly called by starpu_task_create(). By default, tasks initialized with starpu_task_init() must be deinitialized explicitly with starpu_task_clean(). Tasks can also be initialized statically, using ::STARPU_TASK_INITIALIZER. See \ref PerformanceModelCalibration for more details.
starpu_task_insert
Create and submit a task corresponding to \p cl with the following given arguments. The argument list must be zero-terminated.
starpu_task_insert_data_make_room
Assuming that there are already \p current_buffer data handles passed to the task, and if *allocated_buffers is not 0, the task->dyn_handles array has size \p *allocated_buffers, this function makes room for \p room other data handles, allocating or reallocating task->dyn_handles as necessary and updating \p allocated_buffers accordingly. One can thus start with allocated_buffers equal to 0 and current_buffer equal to 0, then make room by calling this function, then store handles with STARPU_TASK_SET_HANDLE(), make room again with this function, store yet more handles, etc. See \ref OtherTaskUtility for more details.
starpu_task_insert_data_process_arg
Store data handle \p handle into task \p task with mode \p arg_type, updating \p *allocated_buffers and \p *current_buffer accordingly. See \ref OtherTaskUtility for more details.
starpu_task_insert_data_process_array_arg
Store \p nb_handles data handles \p handles into task \p task, updating \p *allocated_buffers and \p *current_buffer accordingly. See \ref OtherTaskUtility for more details.
starpu_task_insert_data_process_mode_array_arg
Store \p nb_descrs data handles described by \p descrs into task \p task, updating \p *allocated_buffers and \p *current_buffer accordingly. See \ref OtherTaskUtility for more details.
starpu_task_list_back
Get the back of \p list (without removing it). See \ref SchedulingHelpers for more details.
starpu_task_list_begin
Get the first task of \p list. See \ref SchedulingHelpers for more details.
starpu_task_list_empty
Test if \p list is empty. See \ref SchedulingHelpers for more details.
starpu_task_list_end
Get the end of \p list. See \ref SchedulingHelpers for more details.
starpu_task_list_erase
Remove \p task from \p list. See \ref SchedulingHelpers for more details.
starpu_task_list_front
Get the front of \p list (without removing it). See \ref SchedulingHelpers for more details.
starpu_task_list_init
Initialize a list structure. See \ref SchedulingHelpers for more details.
starpu_task_list_ismember
Test whether the given task \p look is contained in the \p list. See \ref SchedulingHelpers for more details.
starpu_task_list_move
Move list from one head \p lsrc to another \p ldst. See \ref SchedulingHelpers for more details.
starpu_task_list_next
Get the next task of \p list. This is not erase-safe. See \ref SchedulingHelpers for more details.
starpu_task_list_pop_back
Remove the element at the back of \p list. See \ref SchedulingHelpers for more details.
starpu_task_list_pop_front
Remove the element at the front of \p list. See \ref SchedulingHelpers for more details.
starpu_task_list_push_back
Push \p task at the back of \p list. See \ref SchedulingHelpers for more details.
starpu_task_list_push_front
Push \p task at the front of \p list. See \ref SchedulingHelpers for more details.
starpu_task_notify_ready_soon_register
Register a callback to be called when it is determined that a task will be ready an estimated amount of time from now, because its last dependency has just started and we know how long it will take. See \ref SchedulingHelpers for more details.
starpu_task_nready
Return the number of submitted tasks which are ready for execution or are already executing. It thus does not include tasks waiting for dependencies. See \ref WaitingForTasks for more details.
starpu_task_nsubmitted
Return the number of submitted tasks which have not completed yet. See \ref WaitingForTasks for more details.
starpu_task_set
Set the given \p task corresponding to \p cl with the following arguments. The argument list must be zero-terminated. The arguments following the codelet are the same as the ones for the function starpu_task_insert(). If some arguments of type ::STARPU_VALUE are given, the parameter starpu_task::cl_arg_free will be set to 1. See \ref OtherTaskUtility for more details.
starpu_task_set_destroy
Tell StarPU to free the resources associated with \p task when the task is over. This is equivalent to having set task->destroy = 1 before submission, the difference is that this can be called after submission and properly deals with concurrency with the task execution. See \ref WaitingForTasks for more details.
starpu_task_set_implementation
This function should be called by schedulers to specify the codelet implementation to be executed when executing \p task. See \ref SchedulingHelpers for more details.
starpu_task_status_get_as_string
Return the given status as a string
starpu_task_submit
Submit \p task to StarPU. Calling this function does not mean that the task will be executed immediately, as there can be data or task (tag) dependencies that are not fulfilled yet: StarPU will take care of scheduling this task with respect to such dependencies. This function returns immediately if the field starpu_task::synchronous is set to 0, and blocks until the termination of the task otherwise. It is also possible to synchronize the application with asynchronous tasks by means of tags, using the function starpu_tag_wait() for instance. In case of success, this function returns 0; a return value of -ENODEV means that there is no worker able to process this task (e.g. there is no GPU available and this task is only implemented for CUDA devices). starpu_task_submit() can be called from anywhere, including codelet functions and callbacks, provided that the field starpu_task::synchronous is set to 0. See \ref SubmittingATask for more details.
starpu_task_submit_nodeps
Submit \p task to StarPU with dependency bypass.
starpu_task_submit_to_ctx
Submit \p task to the context \p sched_ctx_id. By default, starpu_task_submit() submits the task to a global context that is created automatically by StarPU. See \ref SubmittingTasksToAContext for more details.
starpu_task_wait
Block until \p task has been executed. It is not possible to synchronize with a task more than once. It is not possible to wait for synchronous or detached tasks. Upon successful completion, this function returns 0. Otherwise, -EINVAL indicates that the specified task was either synchronous or detached. See \ref SubmittingATask for more details.
starpu_task_wait_array
Wait for an array of tasks. Upon successful completion, this function returns 0. Otherwise, -EINVAL indicates that one of the tasks was either synchronous or detached. See \ref WaitingForTasks for more details.
starpu_task_wait_for_all
Block until all the tasks that were submitted (to the current context or the global one if there is no current context) are terminated. It does not destroy these tasks. See \ref SubmittingATask for more details.
starpu_task_wait_for_all_in_ctx
Wait until all the tasks that were already submitted to the context \p sched_ctx_id have been terminated. See \ref WaitingForTasks for more details.
starpu_task_wait_for_n_submitted
Block until there are \p n submitted tasks left (to the current context or the global one if there is no current context) to be executed. It does not destroy these tasks. See \ref HowtoReuseMemory for more details.
starpu_task_wait_for_n_submitted_in_ctx
Wait until only \p n of the tasks already submitted to the context \p sched_ctx_id are left to be executed. See \ref WaitingForTasks for more details.
starpu_task_wait_for_no_ready
Wait until there is no more ready task. See \ref WaitingForTasks for more details.
starpu_task_watchdog_set_hook
Set the function to call when the watchdog detects that StarPU has not finished any task for \ref STARPU_WATCHDOG_TIMEOUT seconds. See \ref WatchdogSupport for more details.
starpu_task_worker_expected_energy
Same as starpu_task_expected_energy but for a precise worker. See \ref SchedulingHelpers for more details.
starpu_task_worker_expected_length
Same as starpu_task_expected_length() but for a precise worker. See \ref SchedulingHelpers for more details.
starpu_tcpip_ms_worker_get_count
Return the number of TCPIP Master Slave workers controlled by StarPU. See \ref TopologyWorkers for more details.
starpu_tensor_data_register
Register the \p nx x \p ny x \p nz x \p nt 4D tensor of \p elemsize byte elements pointed to by \p ptr and initialize \p handle to represent it. \p ldy, \p ldz, and \p ldt specify the number of elements between rows, between z planes and between t cubes.
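To make the role of the leading dimensions concrete, here is a minimal plain-Rust sketch of how an element offset could be derived from \p ldy, \p ldz and \p ldt. It does not call StarPU; `TensorLayout` and its methods are hypothetical names introduced for illustration, and offsets are counted in elements (multiply by \p elemsize for bytes).

```rust
/// Leading dimensions of a 4D tensor, counted in elements (not bytes).
/// Hypothetical struct for illustration; not part of starpu-sys.
struct TensorLayout {
    ldy: usize, // elements between two consecutive rows
    ldz: usize, // elements between two consecutive z planes
    ldt: usize, // elements between two consecutive t cubes
}

impl TensorLayout {
    /// Layout of a tensor stored contiguously, without padding.
    fn contiguous(nx: usize, ny: usize, nz: usize) -> Self {
        TensorLayout { ldy: nx, ldz: nx * ny, ldt: nx * ny * nz }
    }

    /// Offset, in elements, of coordinate (x, y, z, t) from the base pointer.
    fn element_offset(&self, x: usize, y: usize, z: usize, t: usize) -> usize {
        x + y * self.ldy + z * self.ldz + t * self.ldt
    }
}
```

With padded rows, only the leading dimensions change: the same `element_offset` formula applies with, say, `ldy` larger than `nx`.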
starpu_tensor_filter_block
Partition a tensor along the X dimension, thus getting (x/\p nparts ,y,z,t) tensors. If \p nparts does not divide x, the last submatrix contains the remainder.
starpu_tensor_filter_block_shadow
Partition a tensor along the X dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting ((x-2shadow)/\p nparts +2shadow,y,z,t) tensors. If \p nparts does not divide x, the last submatrix contains the remainder.
starpu_tensor_filter_depth_block
Partition a tensor along the Z dimension, thus getting (x,y,z/\p nparts,t) tensors. If \p nparts does not divide z, the last submatrix contains the remainder.
starpu_tensor_filter_depth_block_shadow
Partition a tensor along the Z dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting (x,y,(z-2shadow)/\p nparts +2shadow,t) tensors. If \p nparts does not divide z, the last submatrix contains the remainder.
starpu_tensor_filter_pick_block_child_ops
Return the child_ops of the partition obtained with starpu_tensor_filter_pick_block_t(), starpu_tensor_filter_pick_block_z() and starpu_tensor_filter_pick_block_y(). See \ref TensorDataInterface for more details.
starpu_tensor_filter_pick_block_t
Pick \p nparts contiguous blocks from a tensor along the T dimension. The starting position on T-axis is set in starpu_data_filter::filter_arg_ptr.
starpu_tensor_filter_pick_block_y
Pick \p nparts contiguous blocks from a tensor along the Y dimension. The starting position on Y-axis is set in starpu_data_filter::filter_arg_ptr.
starpu_tensor_filter_pick_block_z
Pick \p nparts contiguous blocks from a tensor along the Z dimension. The starting position on Z-axis is set in starpu_data_filter::filter_arg_ptr.
starpu_tensor_filter_pick_variable
Pick \p nparts contiguous variables from a tensor. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_tensor_filter_pick_variable_child_ops
Return the child_ops of the partition obtained with starpu_tensor_filter_pick_variable(). See \ref TensorDataInterface for more details.
starpu_tensor_filter_time_block
Partition a tensor along the T dimension, thus getting (x,y,z,t/\p nparts) tensors. If \p nparts does not divide t, the last submatrix contains the remainder.
starpu_tensor_filter_time_block_shadow
Partition a tensor along the T dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting (x,y,z,(t-2shadow)/\p nparts +2shadow) tensors. If \p nparts does not divide t, the last submatrix contains the remainder.
starpu_tensor_filter_vertical_block
Partition a tensor along the Y dimension, thus getting (x,y/\p nparts ,z,t) tensors. If \p nparts does not divide y, the last submatrix contains the remainder.
starpu_tensor_filter_vertical_block_shadow
Partition a tensor along the Y dimension, with a shadow border starpu_data_filter::filter_arg_ptr, thus getting (x,(y-2shadow)/\p nparts +2shadow,z,t) tensors. If \p nparts does not divide y, the last submatrix contains the remainder.
starpu_tensor_get_elemsize
Return the size of the elements of the tensor designated by \p handle.
starpu_tensor_get_local_ldt
Return the number of elements between consecutive t cubes of the tensor designated by \p handle, in the format of the current memory node.
starpu_tensor_get_local_ldy
Return the number of elements between each row of the tensor designated by \p handle, in the format of the current memory node.
starpu_tensor_get_local_ldz
Return the number of elements between each z plane of the tensor designated by \p handle, in the format of the current memory node.
starpu_tensor_get_local_ptr
Return the local pointer associated with \p handle.
starpu_tensor_get_nt
Return the number of elements on the t-axis of the tensor designated by \p handle.
starpu_tensor_get_nx
Return the number of elements on the x-axis of the tensor designated by \p handle.
starpu_tensor_get_ny
Return the number of elements on the y-axis of the tensor designated by \p handle.
starpu_tensor_get_nz
Return the number of elements on the z-axis of the tensor designated by \p handle.
starpu_tensor_ptr_register
Register into the \p handle that to store data on node \p node it should use the buffer located at \p ptr, or device handle \p dev_handle and offset \p offset (for OpenCL, notably), with \p ldy elements between rows, and \p ldz elements between z planes, and \p ldt elements between t cubes.
starpu_timing_now
Return the current date in micro-seconds. See \ref Preparing for more details.
starpu_timing_timespec_delay_us
Return the time elapsed between \p start and \p end in microseconds. See \ref Per-taskFeedback for more details.
starpu_timing_timespec_to_us
Convert the given timespec \p ts into microseconds. See \ref Per-taskFeedback for more details.
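The two timespec helpers above boil down to simple unit arithmetic. The sketch below reproduces that arithmetic in plain Rust under stated assumptions: `Timespec` is a hypothetical stand-in for C's `struct timespec` (real code would use `libc::timespec`), and the function names are illustrative, not the bindings themselves.

```rust
/// Minimal stand-in for C's `struct timespec` (hypothetical, for illustration).
#[derive(Clone, Copy)]
struct Timespec {
    tv_sec: i64,
    tv_nsec: i64,
}

/// Convert a timespec into microseconds.
fn timespec_to_us(ts: Timespec) -> f64 {
    ts.tv_sec as f64 * 1_000_000.0 + ts.tv_nsec as f64 / 1_000.0
}

/// Microseconds elapsed between `start` and `end`.
fn timespec_delay_us(start: Timespec, end: Timespec) -> f64 {
    // Subtract field by field first to limit floating-point rounding.
    let sec = end.tv_sec - start.tv_sec;
    let nsec = end.tv_nsec - start.tv_nsec;
    sec as f64 * 1_000_000.0 + nsec as f64 / 1_000.0
}
```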
starpu_topology_print
Print a description of the topology on \p f. See \ref ConfigurationAndInitialization for more details.
starpu_transaction_close
Function to mark the end of the last transaction epoch and free the transaction object. See \ref TransactionsClosing for more details.
starpu_transaction_next_epoch
Function to mark the end of the current transaction epoch and start a new epoch. See \ref TransactionsEpochNext for more details.
starpu_transaction_open
Function to open a new transaction object and start the first transaction epoch.
starpu_transfer_bandwidth
Return the bandwidth of data transfer between two memory nodes. See \ref SchedulingHelpers for more details.
starpu_transfer_latency
Return the latency of data transfer between two memory nodes. See \ref SchedulingHelpers for more details.
starpu_transfer_predict
Return the estimated time to transfer a given size between two memory nodes. See \ref SchedulingHelpers for more details.
starpu_tree_free
starpu_tree_get
starpu_tree_get_neighbour
starpu_tree_insert
starpu_tree_prepare_children
starpu_tree_reset_visited
starpu_uncluster_machine
@deprecated Use starpu_parallel_worker_shutdown()
starpu_usleep
Sleep for the given \p nb_micro_sec micro-seconds. In simgrid mode, this only sleeps within virtual time. See \ref Helpers for more details.
starpu_variable_data_register
Register the \p size byte element pointed to by \p ptr, which is typically a scalar, and initialize \p handle to represent this data item.
starpu_variable_get_elemsize
Return the size of the variable designated by \p handle.
starpu_variable_get_local_ptr
Return a pointer to the variable designated by \p handle.
starpu_variable_ptr_register
Register into the \p handle that to store data on node \p node it should use the buffer located at \p ptr, or device handle \p dev_handle and offset \p offset (for OpenCL, notably)
starpu_vector_data_register
Register the \p nx \p elemsize-byte elements pointed to by \p ptr and initialize \p handle to represent it.
starpu_vector_data_register_allocsize
Similar to starpu_vector_data_register, but additionally specifies which allocation size should be used instead of the initial nx*elemsize. See \ref VariableSizeDataInterface for more details.
starpu_vector_filter_block
Return in \p child_interface the \p id th element of the vector represented by \p father_interface once partitioned in \p nparts chunks of equal size.
starpu_vector_filter_block_shadow
Return in \p child_interface the \p id th element of the vector represented by \p father_interface once partitioned in \p nparts chunks of equal size with a shadow border starpu_data_filter::filter_arg_ptr, thus getting a vector of size (n-2shadow)/nparts+2shadow. The starpu_data_filter::filter_arg_ptr field of \p f must be the shadow size cast into \c void*.
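The size formula (n-2shadow)/nparts+2shadow can be checked with a small plain-Rust sketch. It does not call StarPU, and `shadow_chunk` is a hypothetical helper. It assumes that `n` includes both shadow borders, that `nparts` divides `n - 2*shadow` evenly, and that child `id` starts `id` core-sized steps into the father vector; the start offset in particular is an assumption made for illustration.

```rust
/// Element range `(start, len)` of the father vector covered by child `id`,
/// following the documented size (n - 2*shadow)/nparts + 2*shadow.
///
/// Hypothetical helper; returns `None` when the inputs do not fit the
/// assumptions stated above.
fn shadow_chunk(n: usize, shadow: usize, nparts: usize, id: usize) -> Option<(usize, usize)> {
    if nparts == 0 || id >= nparts {
        return None;
    }
    let interior = n.checked_sub(2 * shadow)?;
    if interior % nparts != 0 {
        return None;
    }
    let core = interior / nparts;
    // Each child starts `core` elements after the previous one and spans
    // its own core plus one shadow border on each side.
    Some((id * core, core + 2 * shadow))
}
```

For `n = 12`, `shadow = 1` and `nparts = 2`, both children have 7 elements and neighbouring children overlap by `2 * shadow` elements.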
starpu_vector_filter_divide_in_2
Return in \p child_interface the \p id th element of the vector represented by \p father_interface once partitioned in 2 chunks of equal size, ignoring nparts. Thus, \p id must be 0 or 1.
starpu_vector_filter_list
Return in \p child_interface the \p id th element of the vector represented by \p father_interface once partitioned into \p nparts chunks according to the starpu_data_filter::filter_arg_ptr field of \p f. The starpu_data_filter::filter_arg_ptr field must point to an array of \p nparts uint32_t elements, each of which specifies the number of elements in each chunk of the partition.
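As a rough illustration of how an array of per-chunk lengths maps to chunk positions, the following plain-Rust sketch computes the offset and length of chunk `id`. It does not call StarPU; `list_chunk` is a hypothetical helper, and it assumes the chunks are laid out back to back starting at element 0.

```rust
/// Offset and length (in elements) of chunk `id`, given per-chunk lengths.
/// Hypothetical helper for illustration; not part of starpu-sys.
fn list_chunk(lengths: &[u32], id: usize) -> Option<(u64, u32)> {
    let len = *lengths.get(id)?;
    // The chunk starts where the previous chunks end.
    let offset: u64 = lengths[..id].iter().map(|&l| u64::from(l)).sum();
    Some((offset, len))
}
```

With lengths `[3, 5, 2]`, chunk 1 starts at element 3 and holds 5 elements; an `id` beyond the array yields `None`.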
starpu_vector_filter_list_long
Return in \p child_interface the \p id th element of the vector represented by \p father_interface once partitioned into \p nparts chunks according to the starpu_data_filter::filter_arg_ptr field of \p f. The starpu_data_filter::filter_arg_ptr field must point to an array of \p nparts long elements, each of which specifies the number of elements in each chunk of the partition.
starpu_vector_filter_pick_variable
Pick \p nparts contiguous variables from a vector. The starting position is set in starpu_data_filter::filter_arg_ptr.
starpu_vector_filter_pick_variable_child_ops
Return the child_ops of the partition obtained with starpu_vector_filter_pick_variable(). See \ref VectorDataInterface for more details.
starpu_vector_get_allocsize
Return the allocated size of the array designated by \p handle.
starpu_vector_get_elemsize
Return the size of each element of the array designated by \p handle.
starpu_vector_get_local_ptr
Return the local pointer associated with \p handle.
starpu_vector_get_nx
Return the number of elements registered into the array designated by \p handle.
starpu_vector_ptr_register
Register into the \p handle that to store data on node \p node it should use the buffer located at \p ptr, or device handle \p dev_handle and offset \p offset (for OpenCL, notably)
starpu_void_data_register
Register a void interface. No data is actually associated with that interface, but it may be used as a synchronization mechanism. It also makes it possible to express an abstract piece of data that is managed by the application internally: this allows forbidding the concurrent execution of different tasks accessing the same void data in read-write mode. See \ref DataHandlesHelpers for more details.
starpu_wait_initialized
Wait for starpu_init() call to finish. See \ref ConfigurationAndInitialization for more details.
starpu_wake_all_blocked_workers
Wake all the workers, so they can inspect data requests and task submissions again.
starpu_wake_worker_locked
Version of starpu_wake_worker_no_relax() which assumes that the sched mutex is locked. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_wake_worker_no_relax
Must be called to wake up a worker that is sleeping on the cond. Return 0 whenever the worker is not in a sleeping state or has the state_keep_awake flag on. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_wake_worker_relax
Wake up \p workerid while temporarily entering the current worker relax state if needed during the waiting process. Return 1 if \p workerid has been woken up or its state_keep_awake flag has been set to \c 1, and \c 0 otherwise (if \p workerid was not in the STATE_SLEEPING or in the STATE_SCHEDULING). See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_wake_worker_relax_light
Light version of starpu_wake_worker_relax() which, when possible, speculatively sets keep_awake on the target worker without waiting for the worker to enter the relax state. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_archtype_is_valid
Return true if type matches one of StarPU’s defined worker architectures. See \ref TopologyWorkers for more details.
starpu_worker_can_execute_task
Check if the worker specified by workerid can execute the codelet. Schedulers need to call it before assigning a task to a worker, otherwise the task may fail to execute. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_can_execute_task_first_impl
Check if the worker specified by workerid can execute the codelet and return the first implementation which can be used. Schedulers need to call it before assigning a task to a worker, otherwise the task may fail to execute. This should be preferred rather than calling starpu_worker_can_execute_task() for each and every implementation. It can also be used with impl_mask == NULL to check for at least one implementation without determining which. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_can_execute_task_impl
Check if the worker specified by workerid can execute the codelet and return which implementation numbers can be used. Schedulers need to call it before assigning a task to a worker, otherwise the task may fail to execute. This should be preferred rather than calling starpu_worker_can_execute_task() for each and every implementation. It can also be used with impl_mask == NULL to check for at least one implementation without determining which. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_display_all
Display on \p output the list (if any) of all workers. See \ref TopologyWorkers for more details.
starpu_worker_display_count
Display on \p output the number of workers of the given \p type. See \ref TopologyWorkers for more details.
starpu_worker_display_names
Display on \p output the list (if any) of all the workers of the given \p type. See \ref TopologyWorkers for more details.
starpu_worker_get_bindid
See \ref TopologyWorkers for more details.
starpu_worker_get_by_devid
Return the identifier of the worker that has the specified \p type and device id \p devid (which may not be the n-th, if some devices are skipped for instance). If there is no such worker, \c -1 is returned. See \ref TopologyWorkers for more details.
starpu_worker_get_by_type
Return the identifier of the \p num -th worker that has the specified \p type. If there is no such worker, -1 is returned. See \ref TopologyWorkers for more details.
starpu_worker_get_count
Return the number of workers (i.e. processing units executing StarPU tasks). The return value should be at most \ref STARPU_NMAXWORKERS. See \ref TopologyWorkers for more details.
starpu_worker_get_count_by_type
Return the number of workers of \p type. A non-negative value is returned in case of success; -EINVAL indicates that \p type is not valid. See \ref TopologyWorkers for more details.
starpu_worker_get_current_task_exp_end
Return when the current task is expected to be finished.
starpu_worker_get_devid
Return the device id of the worker \p id. The worker should be identified with the value returned by the starpu_worker_get_id() function. In the case of a CUDA worker, this device identifier is the logical device identifier exposed by CUDA (used by the function \c cudaGetDevice() for instance). The device identifier of a CPU worker is the logical identifier of the core on which the worker was bound; this identifier is either provided by the OS or by the library hwloc in case it is available. See \ref TopologyWorkers for more details.
starpu_worker_get_devids
See \ref TopologyWorkers for more details.
starpu_worker_get_devnum
See \ref TopologyWorkers for more details.
starpu_worker_get_hwloc_cpuset
If StarPU was compiled with \c hwloc support, return a duplicate of the \c hwloc cpuset associated with the worker \p workerid. The returned cpuset is obtained from a \c hwloc_bitmap_dup() function call. It must be freed by the caller using \c hwloc_bitmap_free(). See \ref InteroperabilityHWLOC for more details.
starpu_worker_get_hwloc_obj
If StarPU was compiled with \c hwloc support, return the \c hwloc object corresponding to the worker \p workerid. See \ref SchedulingHelpers for more details.
starpu_worker_get_id
Return the identifier of the current worker, i.e. the one associated with the calling thread. The return value is either \c -1 if the current context is not a StarPU worker (i.e. when called from the application outside a task or a callback), or an integer between \c 0 and starpu_worker_get_count() - \c 1. See \ref HowToInitializeAComputationLibraryOnceForEachWorker for more details.
starpu_worker_get_id_check
Similar to starpu_worker_get_id(), but abort when called from outside a worker (i.e. when starpu_worker_get_id() would return \c -1). See \ref HowToInitializeAComputationLibraryOnceForEachWorker for more details.
starpu_worker_get_ids_by_type
Get the list of identifiers of workers of \p type. Fill the array \p workerids with the identifiers of the \p workers. The argument \p maxsize indicates the size of the array \p workerids. The return value gives the number of identifiers that were put in the array. -ERANGE is returned if \p maxsize is lower than the number of workers with the appropriate type: in that case, the array is filled with the \p maxsize first elements. To avoid such overflows, the value of maxsize can be chosen by means of the function starpu_worker_get_count_by_type(), or by passing a value greater than or equal to \ref STARPU_NMAXWORKERS. See \ref TopologyWorkers for more details.
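The return-value contract described above (count on success, -ERANGE with a fully filled array on overflow) can be modelled in plain Rust. This is only a sketch of the documented behaviour, not the binding itself: `ids_by_type_model`, the `u8` worker-type encoding and the local `ERANGE` constant are all hypothetical (real code would use `libc::ERANGE` and the StarPU worker type enum).

```rust
/// Stand-in for the C errno value; hypothetical constant for this sketch only.
const ERANGE: i32 = 34;

/// Model of the documented contract: copy the identifiers of workers whose
/// type equals `wanted` into `out`, returning how many were written, or
/// `-ERANGE` when `out` is too small (in which case `out` is filled entirely).
fn ids_by_type_model(worker_types: &[u8], wanted: u8, out: &mut [i32]) -> i32 {
    // The index of each entry plays the role of the worker identifier.
    let mut matching = worker_types
        .iter()
        .enumerate()
        .filter(|&(_, &ty)| ty == wanted)
        .map(|(id, _)| id as i32);
    let mut written = 0;
    for slot in out.iter_mut() {
        match matching.next() {
            Some(id) => {
                *slot = id;
                written += 1;
            }
            None => return written,
        }
    }
    // `out` is full: report overflow if matching workers remain.
    if matching.next().is_some() { -ERANGE } else { written }
}
```

In this model, sizing `out` from a prior count of matching workers (the role played by starpu_worker_get_count_by_type() in the text above) avoids the overflow case entirely.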
starpu_worker_get_local_memory_node
Return the memory node associated to the current worker. See \ref TopologyWorkers for more details.
starpu_worker_get_memory_node
Return the identifier of the memory node associated to the worker identified by \p workerid. See \ref TopologyWorkers for more details.
starpu_worker_get_memory_node_kind
Return the type of memory node that arch type \p type operates on. See \ref TopologyWorkers for more details.
starpu_worker_get_name
Get the name of the worker \p id. StarPU associates a unique human readable string to each processing unit. This function copies at most the \p maxlen first bytes of the unique string associated to the worker \p id into the \p dst buffer. The caller is responsible for ensuring that \p dst is a valid pointer to a buffer of at least \p maxlen bytes. Calling this function on an invalid identifier results in unspecified behaviour. See \ref TopologyWorkers for more details.
starpu_worker_get_perf_archtype
Return the architecture type of the worker \p workerid.
starpu_worker_get_relative_speedup
Return an estimated speedup factor relative to CPU speed. See \ref SchedulingHelpers for more details.
starpu_worker_get_relax_state
Return \c !0 if the \c state_relax_refcnt of the current worker is non-zero, and \c 0 otherwise. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_get_sched_condition
When there is no available task for a worker, StarPU blocks this worker on a condition variable. This function specifies which condition variable (and the associated mutex) should be used to block (and to wake up) a worker. Note that multiple workers may use the same condition variable. For instance, in the case of a scheduling strategy with a single task queue, the same condition variable would be used to block and wake up all workers.
starpu_worker_get_sched_ctx_id_stream
starpu_worker_get_sched_ctx_list
See \ref TopologyWorkers for more details.
starpu_worker_get_stream_workerids
See \ref TopologyWorkers for more details.
starpu_worker_get_subworkerid
See \ref TopologyWorkers for more details.
starpu_worker_get_type
Return the type of processing unit associated to the worker \p id. The worker identifier is a value returned by the function starpu_worker_get_id(). The return value indicates the architecture of the worker: ::STARPU_CPU_WORKER for a CPU core, ::STARPU_CUDA_WORKER for a CUDA device, and ::STARPU_OPENCL_WORKER for an OpenCL device. The return value for an invalid identifier is unspecified. See \ref TopologyWorkers for more details.
starpu_worker_get_type_as_env_var
Return worker \p type as a string suitable for environment variable names (CPU, CUDA, etc.). See \ref TopologyWorkers for more details.
starpu_worker_get_type_as_string
Return worker \p type as a string. See \ref TopologyWorkers for more details.
starpu_worker_get_type_from_string
Return worker \p type from a string. Returns STARPU_UNKNOWN_WORKER if the string doesn’t match a worker type. See \ref TopologyWorkers for more details.
starpu_worker_is_blocked_in_parallel
Return whether worker \p workerid is currently blocked in a parallel task. See \ref SchedulingHelpers for more details.
starpu_worker_is_combined_worker
See \ref SchedulingHelpers for more details.
starpu_worker_is_slave_somewhere
See \ref SchedulingHelpers for more details.
starpu_worker_lock
Acquire the sched mutex of \p workerid. If the caller is a worker, distinct from \p workerid, the caller worker automatically enters a relax state while acquiring the target worker lock. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_lock_self
Acquire the current worker sched mutex. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_relax_off
Must be called after a potentially blocking call is complete, to restore the relax state in place before the corresponding starpu_worker_relax_on(). Decreases \c state_relax_refcnt. Calls to starpu_worker_relax_on() and starpu_worker_relax_off() must be properly paired. This function is automatically called by starpu_worker_unlock() after the target worker has been unlocked. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_relax_on
Allow other threads and workers to temporarily observe the current worker state, even though it is performing a scheduling operation. Must be called by a worker before performing a potentially blocking call such as acquiring a mutex other than its own sched_mutex. This function increases \c state_relax_refcnt from the current worker. No more than UINT_MAX-1 nested starpu_worker_relax_on() calls should be performed on the same worker. This function is automatically called by starpu_worker_lock() to relax the caller worker state while attempting to lock the target worker. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_sched_op_pending
Return \c !0 if current worker has a scheduling operation in progress, and \c 0 otherwise.
starpu_worker_trylock
Attempt to acquire the sched mutex of \p workerid. Returns \c 0 if successful, \c !0 if \p workerid sched mutex is held or the corresponding worker is not in a relax state. If the caller is a worker, distinct from \p workerid, the caller worker automatically enters relax state if successfully acquiring the target worker lock. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_type_can_execute_task
Return true if worker type can execute this task. See \ref SchedulingHelpers for more details.
starpu_worker_unlock
Release the previously acquired sched mutex of \p workerid. Restore the relax state of the caller worker if needed. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_unlock_self
Release the current worker sched mutex. See \ref DefiningANewBasicSchedulingPolicy for more details.
starpu_worker_wait_for_initialisation
Wait for all workers to be initialised. Calling this function is normally not necessary. It is called for example in tools/starpu_machine_display to make sure all worker information is correctly set before printing it. See \ref PauseResume for more details.
starpu_workers_get_tree
See \ref TopologyWorkers for more details.
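
The truncation contract documented for starpu_worker_get_ids_by_type() can be illustrated with a small pure-Rust model. This is only a sketch: `model_get_ids_by_type` and `collect_ids` are hypothetical helpers that do not call into StarPU and do not reproduce the binding's real signature, and the `ERANGE` value is the Linux one, assumed here for illustration (real code would take it from `libc`).

```rust
/// ERANGE as defined on Linux. Assumption for this sketch only;
/// real code would use `libc::ERANGE`.
const ERANGE: i32 = 34;

/// Hypothetical model of the documented contract: copy as many ids as fit
/// into `out`, and return either the number of ids written or `-ERANGE`
/// when `out` is too small to hold all of them.
fn model_get_ids_by_type(matching: &[i32], out: &mut [i32]) -> i32 {
    if out.len() < matching.len() {
        // Buffer too small: fill it with the first `out.len()` ids.
        let n = out.len();
        out.copy_from_slice(&matching[..n]);
        return -ERANGE;
    }
    out[..matching.len()].copy_from_slice(matching);
    matching.len() as i32
}

/// Size the buffer from the worker count first, as the documentation
/// suggests, so that truncation cannot happen.
fn collect_ids(matching: &[i32]) -> Result<Vec<i32>, i32> {
    // Stands in for a call to starpu_worker_get_count_by_type().
    let count = matching.len();
    let mut ids = vec![0; count];
    let ret = model_get_ids_by_type(matching, &mut ids);
    if ret < 0 {
        return Err(ret);
    }
    ids.truncate(ret as usize);
    Ok(ids)
}
```

A safe wrapper built on the real binding would likely follow the same shape: query the count, allocate, call the unsafe function, then treat a negative return as an error instead of reading the partially filled buffer.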

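The pairing requirement between starpu_worker_relax_on() and starpu_worker_relax_off() is the kind of invariant a future safe wrapper could enforce with an RAII guard. The sketch below models this in pure Rust: `WorkerState` and `RelaxGuard` are hypothetical names, and the `Cell<u32>` counter merely stands in for `state_relax_refcnt`. Nothing here calls into StarPU.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the per-worker `state_relax_refcnt` counter.
pub struct WorkerState {
    relax_refcnt: Cell<u32>,
}

impl WorkerState {
    pub fn new() -> Self {
        Self { relax_refcnt: Cell::new(0) }
    }

    /// Models starpu_worker_get_relax_state(): true while the count is non-zero.
    pub fn is_relaxed(&self) -> bool {
        self.relax_refcnt.get() != 0
    }

    /// Models starpu_worker_relax_on(): increments the count and returns a
    /// guard whose drop performs the matching decrement.
    #[must_use]
    pub fn relax(&self) -> RelaxGuard<'_> {
        let next = self
            .relax_refcnt
            .get()
            .checked_add(1)
            .expect("too many nested relax calls");
        self.relax_refcnt.set(next);
        RelaxGuard { state: self }
    }
}

/// Dropping the guard models starpu_worker_relax_off().
pub struct RelaxGuard<'a> {
    state: &'a WorkerState,
}

impl Drop for RelaxGuard<'_> {
    fn drop(&mut self) {
        // Every guard was created by an increment, so this cannot underflow.
        self.state.relax_refcnt.set(self.state.relax_refcnt.get() - 1);
    }
}
```

With this shape, nested relax sections unwind in reverse order automatically, and an early return or panic inside a relaxed section still restores the previous state.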
Type Aliases§

__builtin_va_list
__gnuc_va_list
hipblasHandle_t
starpu_arbiter_t
This is an arbiter, which implements an advanced but centralized management of concurrent data accesses, see \ref ConcurrentDataAccess for the details.
starpu_bubble_func_t
@ingroup API_Bubble Hierarchical Dags Bubble decision function
starpu_bubble_gen_dag_func_t
@ingroup API_Bubble Hierarchical Dags Bubble DAG generation function
starpu_cluster_types
@deprecated Use ::starpu_parallel_worker_types
starpu_codelet_type
Describe the type of parallel task. See \ref ParallelTasks for details.
starpu_cpu_func_t
CPU implementation of a codelet.
starpu_cuda_func_t
CUDA implementation of a codelet.
starpu_data_access_mode
Describe a StarPU data access mode
starpu_data_handle_t
StarPU uses ::starpu_data_handle_t as an opaque handle to manage a piece of data. Once a piece of data has been registered to StarPU, it is associated to a ::starpu_data_handle_t which keeps track of the state of the piece of data over the entire machine, so that we can maintain data consistency and locate data replicates for instance. See \ref DataInterface for more details.
starpu_data_interface_id
Identifier for all predefined StarPU data interfaces
starpu_drand48_data
starpu_free_hook
starpu_hip_func_t
HIP implementation of a codelet.
starpu_is_prefetch
Prefetch levels
starpu_malloc_hook
starpu_max_fpga_func_t
Maxeler FPGA implementation of a codelet.
starpu_node_kind
Memory node Type
starpu_notify_ready_soon_func
starpu_omp_proc_bind_value
Set of constants for selecting the processor binding method, as defined in the OpenMP specification. \sa starpu_omp_get_proc_bind()
starpu_omp_sched_value
Set of constants for selecting the for loop iteration scheduling algorithm (\anchor OMPFor) as defined by the OpenMP specification. \sa starpu_omp_for() \sa starpu_omp_for_inline_first() \sa starpu_omp_for_inline_next() \sa starpu_omp_for_alt() \sa starpu_omp_for_inline_first_alt() \sa starpu_omp_for_inline_next_alt()
starpu_opencl_func_t
OpenCL implementation of a codelet.
starpu_parallel_worker_types
These represent the default available functions to enforce parallel_worker use by the sub-runtime
starpu_perf_counter_scope
Enum of all possible performance counter scopes.
starpu_perf_counter_type
Enum of all possible performance counter value types.
starpu_perf_knob_scope
Enum of all possible performance knob scopes.
starpu_perf_knob_type
Enum of all possible performance knob value types.
starpu_perfmodel_per_arch_cost_function
starpu_perfmodel_per_arch_size_base
starpu_perfmodel_state_t
starpu_perfmodel_type
todo
starpu_prof_tool_cb_func
starpu_prof_tool_command
todo
starpu_prof_tool_driver_type
todo
starpu_prof_tool_entry_func
A function with this signature must be implemented by external tools that want to use the callbacks
starpu_prof_tool_entry_register_func
Register / unregister events
starpu_prof_tool_event
Event type
starpu_pthread_attr_t
starpu_pthread_barrier_t
starpu_pthread_barrierattr_t
starpu_pthread_cond_t
starpu_pthread_condattr_t
starpu_pthread_key_t
starpu_pthread_mutex_t
starpu_pthread_mutexattr_t
starpu_pthread_rwlock_t
starpu_pthread_rwlockattr_t
starpu_pthread_t
starpu_sem_t
starpu_ssize_t
starpu_tag_t
Define a task logical identifier. It is possible to associate a task with a unique tag chosen by the application, and to express dependencies between tasks by means of those tags. To do so, fill the field starpu_task::tag_id with a tag number (can be arbitrary) and set the field starpu_task::use_tag to 1. If starpu_tag_declare_deps() is called with this tag number, the task will not be started until the tasks which hold the declared dependency tags are completed.
starpu_task_bundle_t
Opaque structure describing a list of tasks that should be scheduled on the same worker whenever it’s possible. It must be considered as a hint given to the scheduler as there is no guarantee that they will be executed on the same worker.
starpu_task_status
todo
starpu_trs_epoch_t
starpu_worker_archtype
Worker Architecture Type
starpu_worker_collection_type
Types of structures the worker collection can implement
va_list
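
The tag dependency rule described under starpu_tag_t (a tagged task does not start until every task holding one of its declared dependency tags has completed) can be modelled in a few lines of pure Rust. This is an illustrative sketch only: `TagDeps` and its methods are hypothetical, the `u64` tag representation is an assumption, and no StarPU function is called.

```rust
use std::collections::{HashMap, HashSet};

/// Tag representation assumed for this sketch.
type Tag = u64;

/// Hypothetical bookkeeping for tag dependencies.
#[derive(Default)]
struct TagDeps {
    /// Dependencies declared for each tag.
    deps: HashMap<Tag, Vec<Tag>>,
    /// Tags whose tasks have completed.
    done: HashSet<Tag>,
}

impl TagDeps {
    /// Models the effect of starpu_tag_declare_deps(): `tag` depends on `deps`.
    fn declare_deps(&mut self, tag: Tag, deps: &[Tag]) {
        self.deps.entry(tag).or_default().extend_from_slice(deps);
    }

    /// Record that the task holding `tag` has completed.
    fn complete(&mut self, tag: Tag) {
        self.done.insert(tag);
    }

    /// A task may start once all of its declared dependency tags are done.
    /// A tag with no declared dependencies is never blocked.
    fn can_start(&self, tag: Tag) -> bool {
        self.deps
            .get(&tag)
            .map_or(true, |deps| deps.iter().all(|d| self.done.contains(d)))
    }
}
```

In the model, a task tagged `0x42` that depends on `0x10` and `0x20` becomes startable only after both have been marked complete.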

Unions§

starpu_prof_tool_event_info
Event info