cuda-rust-wasm 0.1.6

CUDA to Rust transpiler with WebGPU/WASM support
Documentation
# Module Structure Fixes Completed

## Issues Fixed:

1. **Fixed duplicate `kernel_function` macro export**
   - Removed duplicate macro definition from `error.rs`
   - Kept the proper implementation in `runtime/kernel.rs`
   - Fixed incorrect re-export attempts in `lib.rs`

2. **Fixed transpiler module structure**
   - Added missing `ast` module to `transpiler/mod.rs`

3. **Fixed backend_trait module exports**
   - Added `BackendTrait` export in `backend/mod.rs`
   - Added backward compatibility alias `Backend` for `BackendTrait`
   - Updated `BackendCapabilities` struct to include all required fields

4. **Created missing unified_memory module**
   - Implemented `UnifiedMemory` struct with proper allocation/deallocation
   - Added safety implementations for Send/Sync
   - Included helper functions and tests

5. **Fixed profiling module CudaError type**
   - Changed all references from `CudaError` to `CudaRustError`
   - Updated all IO error handling to properly convert to `CudaRustError`
   - Fixed export functions in all profiling submodules

6. **Fixed wasm_runtime module**
   - Updated to implement the correct `BackendTrait` interface
   - Added async trait support
   - Implemented all required trait methods
   - Fixed `BackendCapabilities` initialization

7. **Added missing error macros**
   - Added `memory_error!` macro to `error.rs`

## Module Structure Now:
- All imports should resolve correctly
- No duplicate macro exports
- All trait implementations match their definitions
- All error types are consistent

## Remaining Issues (not module-related):
- Type mismatches in AST structures (parser vs transpiler AST types)
- Missing method implementations in various structs
- These are implementation issues, not module organization issues