📓 nodes/20230628195927-domain.org by @tekakutli-org ☆

ARCHITECTURE

PHYSICS - CHEMICAL - PARTICLES

Complex Physics with Graph Networks https://arxiv.org/pdf/2002.09405
PAC-NeRF Physics Augmented Continuum Neural Radiance Fields
Scaling Spherical CNNs: vs graph neural network, molecular
- spherical convolutions are computed most efficiently in the spectral domain through the convolution theorem
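
A minimal sketch of the convolution theorem mentioned above, in its simplest 1D circular form. This is not the spherical-harmonic version that spherical CNNs rely on; the function names (`dft`, `circular_conv_direct`, `circular_conv_spectral`) are made up for illustration. It only shows that multiplying spectra pointwise gives the same result as convolving directly.

```python
import cmath

def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative, not optimized)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, value in enumerate(signal):
            total += value * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalization.
        out.append(total / n if inverse else total)
    return out

def circular_conv_direct(f, g):
    """Circular convolution computed directly from the definition."""
    n = len(f)
    return [sum(f[m] * g[(k - m) % n] for m in range(n)) for k in range(n)]

def circular_conv_spectral(f, g):
    """Same convolution via the convolution theorem: DFT, multiply, inverse DFT."""
    F, G = dft(f), dft(g)
    product = [a * b for a, b in zip(F, G)]
    return [c.real for c in dft(product, inverse=True)]
```

Both functions should agree up to floating-point error for any two real sequences of equal length.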

DUSt3R Geometric 3D Vision Made Easy
- global alignment of pixels from sparse views, no need for camera position

3D molecule generation by denoising voxel grids
- unlike the usual diffusion models applied to atom point clouds
DiffPoint Single and Multi-view Point Cloud Reconstruction with ViT Based Diffusion Model
- divide the noisy point clouds into irregular patches, target points based on input images
PointInfinity Resolution-Invariant Point Diffusion Models
- efficient training on low-resolution point clouds, allowing high-resolution point clouds to be generated during inference
- transformer-based architecture with a fixed-size, resolution-invariant latent representation

biology inspired AI: genes or local-context-evolution https://youtu.be/vf18FLdKkY4 CPPN algorithm
- next paper: HyperNEAT http://axon.cs.byu.edu/~dan/778/papers/NeuroEvolution/stanley3**.pdf
  - continuation: http://eplex.cs.ucf.edu/ESHyperNEAT/
- neat algorithm: https://youtu.be/3nbvrrdymF0
The AI Epiphany NeuroEvolution of Augmenting Topologies (NEAT) and Compositional Pattern Producing Networks (CPPN)
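
A rough sketch of the CPPN idea from the notes above: a pattern is produced by querying a small network of composed functions at every coordinate. In NEAT/CPPN work the wiring and weights are evolved; here the nodes and weights are hand-picked assumptions, and `cppn` and `render` are hypothetical names, so this only illustrates the coordinate-to-value mapping.

```python
import math

def cppn(x, y):
    """Hand-wired stand-in for an evolved CPPN: maps a coordinate to a value in (-1, 1)."""
    d = math.sqrt(x * x + y * y)        # distance input, gives radial structure
    h1 = math.sin(3.0 * x)              # periodic node, gives repetition along x
    h2 = math.exp(-(d * d) / 0.5)       # gaussian node, gives a central blob
    h3 = abs(y)                         # abs node, gives mirror symmetry across the x axis
    return math.tanh(1.5 * h1 + 2.0 * h2 - 1.0 * h3)

def render(size=9):
    """Query the CPPN on a size x size grid covering [-1, 1] x [-1, 1]."""
    rows = []
    for j in range(size):
        y = 2.0 * j / (size - 1) - 1.0
        rows.append([cppn(2.0 * i / (size - 1) - 1.0, y) for i in range(size)])
    return rows
```

Because the pattern is a function of coordinates, the same network can be rendered at any resolution by changing `size`.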
Forward-Forward (vs Backpropagation), analog computers

JEPA - https://youtu.be/jSdHmImyUjk
- Self-Supervised Learning, Energy-Based Models, and hierarchical predictive
  - the encoder ignoring useless information
- https://openreview.net/forum?id=BZ5a1r-kVsf
Energy Transformer: more efficient, electric
- transformers without skip connections or normalisation layers https://arxiv.org/pdf/2302.10322.pdf
- Conformers: local and global attention

NERF ALIKES representations for video-image
supervision: time objects spend in the zone
- like employees at their table, cars at parking lot
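
A minimal sketch of the time-in-zone idea above, assuming tracked detections are already available as `(timestamp_seconds, object_id, x, y)` tuples and the zone is an axis-aligned rectangle. The names `in_zone` and `time_in_zone` are hypothetical and do not correspond to any particular library's API.

```python
def in_zone(x, y, zone):
    """zone is (x_min, y_min, x_max, y_max); borders count as inside."""
    x_min, y_min, x_max, y_max = zone
    return x_min <= x <= x_max and y_min <= y <= y_max

def time_in_zone(detections, zone):
    """Accumulate seconds each object spends inside the zone.

    detections: iterable of (timestamp_seconds, object_id, x, y), sorted by time.
    Time is counted only between consecutive in-zone observations of an object.
    """
    totals = {}
    last_seen = {}  # object_id -> timestamp of its previous in-zone observation
    for t, obj, x, y in detections:
        if in_zone(x, y, zone):
            totals.setdefault(obj, 0.0)
            if obj in last_seen:
                totals[obj] += t - last_seen[obj]
            last_seen[obj] = t
        else:
            # Leaving the zone ends the current stay.
            last_seen.pop(obj, None)
    return totals
```

For example, a car observed in the zone at 0 s and 1 s, outside at 2 s, and back inside at 3 s and 4.5 s accumulates 2.5 s.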

Interactive Garment Recommendation with User in the Loop (algorithm)
- ingesting user feedback to improve its recommendations and maximize user satisfaction

OS-Copilot Towards Generalist Computer Agents with Self-Improvement
- strong generalization to unseen applications via accumulated skills from previous tasks
OSWorld Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
- accomplish complex computer tasks with minimal human intervention
- multimodal agents

open-source Rabbit
WebVoyager Building an End-to-End Web Agent with Large Multimodal Models
- interacting with real-world websites

Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation
- tracking and reconstruction of novel objects for in-hand manipulation
MACS Mass Conditioned 3D Hand and Object Motion Synthesis
- improve naturalness of the synthesized 3D hand object motions
- generalize to unseen masses
CyberDemo Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation
- simulated human demonstrations for real-world tasks

PAM: A Parallel Attention Network for Cattle Face Recognition
- focuses on local and global features
- for animal husbandry and behavioral research

Towards mitigating uncann(eye)ness in face swaps via gaze-centric loss terms
- novel loss equation for the training of face swapping models

A Unified and Interpretable Emotion Representation and Expression Generation
- compound emotions

CAPTIONING
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models
- unpaired text-only data used to enhance paired audio-text data
- to detect turns
VideoReTalking Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

CityDreamer Compositional Generative Model of Unbounded 3D Cities (imagining map layout city)
GENERATE BLENDER

Active Neural Mapping; scene reconstruction, gain knowledge of the environment
Doppelgangers Learning to Disambiguate Images of Similar Structures
- can distinguish illusory matches in difficult cases, using the spatial distribution of local keypoints