
Anagnorisis

A completely local data-management platform with a built-in trainable recommendation engine.

The core idea is to create a self-hosted local-first media platform where you can rate your data, and the system trains a personal model to understand your preferences. This model then sorts your data based on your predicted interest, creating a personalized filter for any type of media you might have — images, music, videos, articles, and more.

  • Local First: All data stays on your device only; all models are trained and run locally.
  • AI Powered: Uses advanced embedding models (CLAP, SigLIP, Jina) to understand, search, and filter your content and to estimate your preferences.
  • Full-Stack: Built with Flask, Bulma, Transformers, and PyTorch, with a simple Docker setup for easy deployment.
  • Open Source: AGPL-3.0 license. Contributions, feedback and support are always welcome!
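
For illustration only, here is a minimal sketch of the recommendation idea, assuming a hypothetical setup in which media embeddings are precomputed locally and a small PyTorch head is trained on the user's own ratings (class and variable names are made up, not the project's actual API):

```python
# Hypothetical illustration: rank media items by a small preference head
# trained on the user's own ratings of locally computed embeddings.
import torch
import torch.nn as nn

class PreferenceHead(nn.Module):
    """Maps a media embedding (e.g. a SigLIP/CLAP output) to a predicted rating."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        return self.net(emb).squeeze(-1)

def train(head, embeddings, ratings, epochs=50):
    # embeddings: (N, dim) tensor of locally computed embeddings
    # ratings:    (N,) tensor of the user's scores for those items
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(head(embeddings), ratings)
        loss.backward()
        opt.step()
    return head

# Sorting a library then reduces to ranking by the head's predictions:
# scores = head(library_embeddings); order = torch.argsort(scores, descending=True)
```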

Status: Active Development.

View on GitHub


Vector Based Language

A research project exploring a continuous, visual language that hints at the possibility of directly understanding the embedding spaces produced by ML models.

Unlike traditional discrete languages, this system uses machine learning to generate unique visual representations (images) for any given text embedding. The goal is to create a language where concepts can blend into each other continuously, mirroring how neural networks process information.

Words and sentences are represented as points in a continuous vector space, allowing for smooth transitions between concepts. Experiments show that humans can learn to interpret these generated visual embeddings with increasing accuracy over time. The visual language is designed to be perfectly reconstructible back into the original text embeddings by a decoder network, creating a bridge between human cognition and the “black box” machine interpretation of the data.
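
As a rough, hypothetical sketch of that round trip (the generator/decoder architectures and dimensions below are made up for illustration, not the project's actual models): a text embedding is rendered into an image, the decoder maps the image back to embedding space, and the reconstruction error drives training.

```python
# Illustrative only: embedding -> image -> embedding round trip,
# trained so the decoder can recover the original text embedding.
import torch
import torch.nn as nn

EMB_DIM, IMG = 512, 64  # hypothetical embedding size and image resolution

generator = nn.Sequential(              # text embedding -> visual "word"
    nn.Linear(EMB_DIM, IMG * IMG), nn.Tanh(),
    nn.Unflatten(1, (1, IMG, IMG)),
)
decoder = nn.Sequential(                # visual "word" -> text embedding
    nn.Flatten(),
    nn.Linear(IMG * IMG, EMB_DIM),
)

opt = torch.optim.Adam([*generator.parameters(), *decoder.parameters()], lr=1e-3)
emb = torch.randn(32, EMB_DIM)          # stand-in for real text embeddings

for _ in range(100):
    opt.zero_grad()
    image = generator(emb)                       # continuous visual representation
    recon = decoder(image)                       # reconstructed embedding
    loss = nn.functional.mse_loss(recon, emb)    # reconstructibility objective
    loss.backward()
    opt.step()
```

Because the generator is a continuous mapping, interpolating between two embeddings produces images that blend the corresponding concepts.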

Status: Completed Experiment

Read the Article | View on GitHub


4D Volumetric Retina Simulation

A physics-based 4D path tracing simulation visualizing how a four-dimensional being might perceive the world through a 3D volumetric retina.

Current visualizations of 4D space often rely on wireframe projections or simple 3D cross-sections. This project takes a more biologically plausible approach: analogous to how we 3D beings perceive our world via 2D retinas, a 4D creature would likely possess a 3D (volumetric) retina.

This simulation implements a custom 4D path tracing engine (using Python and Taichi for GPU acceleration) to model light interactions within a hyper-scene containing a rotating tesseract. It simulates image formation by casting 4D rays onto a defined 3D retinal volume.
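
A loose sketch of that ray setup, in plain NumPy rather than the project's Taichi kernels and with made-up camera parameters, might look like this:

```python
# Illustrative sketch: generate 4D primary rays through a 3D retinal grid.
# Plain NumPy stand-in for the project's GPU (Taichi) implementation.
import numpy as np

RES = 32                                     # retina resolution per axis (hypothetical)
focal = 2.0                                  # pinhole-to-retina distance (hypothetical)
camera_pos = np.array([0.0, 0.0, 0.0, -5.0])

# The retina is a 3D volume of "pixels" (voxels) spanning [-1, 1]^3.
axis = np.linspace(-1.0, 1.0, RES)
rx, ry, rz = np.meshgrid(axis, axis, axis, indexing="ij")

# Each voxel defines a ray direction through the 4D pinhole along the w axis.
dirs = np.stack([rx, ry, rz, np.full_like(rx, focal)], axis=-1)   # (RES, RES, RES, 4)
dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

origins = np.broadcast_to(camera_pos, dirs.shape)
# A 4D path tracer would now intersect (origins, dirs) with the hyper-scene
# geometry, such as the rotating tesseract, and accumulate radiance per voxel.
```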

The simulation features physically based rendering that models light bounces, shadows, and perspective in four spatial dimensions. It simulates a 3D sensor array rather than a flat plane and implements a Gaussian fall-off for retinal sensitivity, mimicking foveal vision where the center of the 3D gaze is most acute. To make this comprehensible to human eyes, the 3D retinal image is composited from multiple depth slices, additively blended to represent the density of information a 4D being would process simultaneously.
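
The retina-specific steps could be sketched as follows (again in NumPy, with assumed shapes and constants): a Gaussian weight centered on the gaze direction modulates each voxel's sensitivity, and the weighted volume is collapsed into a viewable 2D image by additively blending its depth slices.

```python
# Illustrative sketch: foveal Gaussian fall-off and additive depth-slice compositing.
import numpy as np

def foveated(retina: np.ndarray, sigma: float = 0.4) -> np.ndarray:
    """Weight a (R, R, R) retinal volume by distance from the gaze center."""
    r = retina.shape[0]
    axis = np.linspace(-1.0, 1.0, r)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    weight = np.exp(-(x**2 + y**2 + z**2) / (2.0 * sigma**2))
    return retina * weight

def composite(retina: np.ndarray) -> np.ndarray:
    """Additively blend depth slices of the 3D retinal image into a 2D view."""
    image = retina.sum(axis=2)                 # sum over the depth axis
    return image / image.max()                 # normalize for display

volume = np.random.rand(32, 32, 32)            # stand-in for accumulated radiance
preview = composite(foveated(volume))          # (32, 32) displayable image
```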

Status: Completed Experiment

View on GitHub | Video Showcase


SD-CN-Animation

This project was developed as an extension for the Automatic1111 web UI to automate video stylization and enable text-to-video generation using Stable Diffusion 1.5 backbones. At the time of development, generated videos often suffered from severe flickering and temporal inconsistency. This framework addressed those issues by integrating the RAFT optical flow estimation algorithm. By calculating the motion flow between frames, the system could warp the previously generated frame to match the motion of the next one, creating a stable base for the diffusion model. This process, combined with occlusion masks, ensured that only new parts of the scene were generated while maintaining the consistency of existing objects.
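
The warping step can be illustrated with a short, hypothetical sketch (not the extension's actual code; function names and the flow convention are assumptions): the previous frame is backward-warped with the estimated flow, and an occlusion mask decides where the diffusion model must generate fresh content.

```python
# Illustrative sketch: warp the previously generated frame with an optical-flow
# field and reuse it only where the occlusion mask says content is still visible.
import cv2
import numpy as np

def warp_previous(prev_frame: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Backward-warp prev_frame using a flow field (H, W, 2) that points from
    the current frame's pixels to their source in the previous frame."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(prev_frame, map_x, map_y, cv2.INTER_LINEAR)

def blend(warped: np.ndarray, generated: np.ndarray, occlusion: np.ndarray) -> np.ndarray:
    """occlusion is 1.0 where new content must be generated and 0.0 where the
    warped previous frame can be reused, which is what keeps frames consistent."""
    occlusion = occlusion[..., None].astype(np.float32)
    return (1.0 - occlusion) * warped + occlusion * generated
```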

The tool supported both video-to-video stylization and experimental text-to-video generation. In video-to-video mode, users could apply ControlNet to guide the structure of the output, allowing for stable transformations like turning a real-life video into a watercolor painting or digital art while preserving the original motion. The text-to-video mode employed a custom “FloweR” method to hallucinate optical flow from static noise, attempting to generate continuous motion from text prompts alone.

Development on this project was eventually discontinued as the field rapidly advanced. The emergence of modern, end-to-end text-to-video models provided much more coherent and faithful results than could be achieved by hacking image-based diffusion models, rendering this approach largely obsolete for general use cases.

Status: Not Maintained

View on GitHub


Generative Art Synthesizer

A Python program that generates Python programs that generate generative art.

Most generative art relies on stochastic processes where the initial seed and specific parameters are often lost, making exact reproduction difficult. GAS takes a different approach: instead of storing just the output or the parameters, it generates a fully deterministic, standalone Python script for each artwork. This ensures complete reproducibility—if you have the script, you have the art.

The core mechanism involves initializing a tensor with coordinate data and then applying a random sequence of mathematical transformations (like transit, sin, magnitude, shift) to its channels. Each operation keeps its output within the [-1, 1] range to ensure numerical stability. The final result is a composition of these channels converted into color space.
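
As a toy illustration of this pipeline (the operation names and sequence below are stand-ins, not GAS's actual operation set), a coordinate tensor is pushed through a fixed chain of bounded transforms and the resulting channels are mapped to RGB:

```python
# Toy illustration: coordinate channels -> chain of bounded transforms -> RGB image.
# Operation names are stand-ins, not GAS's own set.
import numpy as np
from PIL import Image

SIZE = 512
axis = np.linspace(-1.0, 1.0, SIZE)
x, y = np.meshgrid(axis, axis)
channels = np.stack([x, y, x * y])          # initial coordinate tensor, values in [-1, 1]

def sin_op(c):      return np.sin(np.pi * c)            # stays within [-1, 1]
def magnitude(c):   return 2.0 * np.abs(c) - 1.0        # rescaled back into [-1, 1]
def shift(c):       return np.roll(c, SIZE // 7, axis=-1)

for op in (sin_op, magnitude, shift, sin_op):            # a fixed, reproducible sequence
    channels = op(channels)

rgb = ((channels.transpose(1, 2, 0) + 1.0) * 127.5).astype(np.uint8)
Image.fromarray(rgb).save("art.png")
```

Because the sequence of operations is written out explicitly in the script, re-running it reproduces the image pixel for pixel.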

  • Self-Contained Art: Each generated piece is a runnable Python script with zero external dependencies beyond standard scientific libraries (numpy, PIL).
  • Deterministic: The generated scripts contain no random elements; running the same script always produces the exact same image.
  • Method-Based Generation: Uses a palette of composable mathematical functions (sin, prod, soft_min, etc.) to “sculpt” the image in a high-dimensional channel space.
  • Aesthetic Scoring: Includes a simple scoring model to estimate the visual quality of generated outputs.

Status: Completed Experiment

View on GitHub