Projects
Electrical engineering
Microarchitecture (x86 processor) (UT EE382N-19)

Our major design objective was to create a somewhat realistic, in-order x86 processor. Since it was in-order, maximizing memory-level parallelism (MLP) and leveraging locality (to improve latency) were of utmost importance. Thus we put in features such as multiple MSHRs, a store and eviction buffer, critical-word-first delivery, and memory banks with row buffers. Additionally, to ensure instruction throughput, we added a branch predictor, an instruction queue, a banked register file, and data forwarding (with scoreboarding).
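As a small illustration of the critical-word-first idea mentioned above, the sketch below computes the order in which the words of a cache block would be returned on a miss. It is not taken from the project's design; the function name and the 8-word block size are assumptions chosen for the example.

```python
def critical_word_first_order(block_words: int, critical_index: int) -> list[int]:
    """Return the order in which the words of a cache block are delivered.

    The word the processor is stalled on is sent first, and the rest of
    the block follows in wrap-around order.
    """
    if not 0 <= critical_index < block_words:
        raise ValueError("critical_index must fall inside the block")
    return [(critical_index + i) % block_words for i in range(block_words)]


# Hypothetical 8-word block where the load missed on word 5.
print(critical_word_first_order(8, 5))  # [5, 6, 7, 0, 1, 2, 3, 4]
```

The stalled load can resume as soon as the first word arrives, instead of waiting for the whole block.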
Realtime GPU Raytracing (UT EE382N-20)

In this paper, we aim to explore current methods for realtime raytracing and to elaborate upon them in order to produce a near-realtime implementation. Additionally, while some prior work implements only partial features (e.g. only primary and shadow rays), we aim to support primary and all secondary (refraction, reflection, light/shadow) rays as well as arbitrary triangle meshes and textures. We decided to implement the final raytracer in CUDA due to the availability of NVIDIA hardware and its toolset.
Lightcuts and Illumination (UT CS384G)

While acceleration structures such as k-d trees or bounding volume hierarchies greatly reduce the ray-object intersection complexity, every light source still needs to be assessed during Phong shading. This can be troublesome for scenes with large numbers of light sources, luminous objects, or global illumination. Lightcuts is a technique that seeks to reduce the overhead of iterating through a large number of light sources by combining nearby light sources together. This of course introduces a certain amount of error, for which a threshold can be preset. This project seeks to implement objects and methods that create large numbers of light sources within the scene, and to demonstrate the speedup that lightcuts provide when rendering such scenes.
Node.js Asynchronous Compute-Bound Multithreading (UT EE382V)

We propose a system for asynchronous multithreading in Node.js. By using annotations and extending V8 and Node.js, common asynchronous paradigms can be accelerated past the single-threaded model of Node.js. Our system extends the ECMAScript 5.1 directive syntax to allow for compute-bound annotations. Additionally, V8 is modified to allow Node.js to run a function in a separate V8 instance (Isolate). Upon completion of the compute-bound function, a callback is executed within the main event loop of Node.js. In our initial results, we recorded a speedup of at least 3x (6x maximum) across a set of V8 and custom compute-bound benchmarks on a 4-core Intel Core i5 system.
Highly Configurable Power Virus for GPGPUs (UT EE382M-15)

As GPUs become commonplace co-processors, methods for improving power usage without affecting performance are being analyzed. Currently, GPUs are transistor-dense devices with many independent cores, yielding high power usage. One concern shared by both CPUs and GPUs is how much power they will use when operating at peak performance, and how to develop programs that let developers test their system at or near that point. For multicore microprocessors, tools have been created for developing power viruses that stress test how well the system handles the additional power and heat. These tools allow the developer to create tailored power viruses by providing parameters of the system under test, without knowing the fine-grained details of every functional unit in the processor that would be necessary to write the same kind of program by hand. However, this kind of tool has not been created for GPGPUs, and the existing capabilities for general-purpose multicore microprocessors cannot be used for this purpose. We created a tool that generates power viruses for various GPU architectures, using GPGPU-Sim-Wattch to simulate each virus and measure its power usage. In the end, we were able to create power viruses for the GTX 480 and Quadro FX5600 that came close to or exceeded the thermal limits for these cards, and we discovered correlations between certain hardware features and the corresponding parameters to our code generator that come close to producing the best power viruses. We also found the Plackett-Burman design to be very helpful for tuning the search space for our genetic algorithm and for discovering trends in our results.
A Study of 3DIC Kogge-Stone Circuits (UT EE382N-14)

Component delays have traditionally been the focus for performance optimization of logic circuits. As component sizes and delays shrink, wire delays become more significant. Awareness and planning of component placement to optimize wire length then becomes more valuable. The addition of another dimension through die stacking creates more opportunities for shortening this length by shrinking the absolute distance between components. Through-Silicon Vias (TSVs) act as the electrical connection between die layers. We’ve found that TSVs have RC characteristics that create a communication delay smaller than the delay created by the length of wire needed to separate modules in a two-dimensional space, meaning the incurred delay can be lessened by separating the modules by die layer. While TSVs have beneficial delay characteristics, they have an area requirement far more expensive than their wire counterparts, which may make them hard to place in the design, especially if many signals need to transfer layers. To demonstrate the effectiveness of this method, we chose a Kogge-Stone adder because of its notoriety for having many long wires.
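To show where the long wires in a Kogge-Stone adder come from, here is a minimal bit-level sketch of its parallel-prefix carry network. It is a software model only, not the circuit studied in the project; the function name and the default 8-bit width are assumptions for the example.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> int:
    """Add two unsigned integers with a Kogge-Stone parallel-prefix network.

    Returns the (width + 1)-bit result, carry-out included.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    # Per-bit generate and propagate signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    half_sum = p[:]  # the sum bits need the original propagate values

    # log2(width) prefix levels: at each level, bit i combines with the
    # bit `dist` positions below it, and `dist` doubles every level.
    # Those doubling spans are the long wires of the physical layout.
    dist = 1
    while dist < width:
        new_g, new_p = g[:], p[:]
        for i in range(dist, width):
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2

    # After the prefix tree, g[i] is the carry out of bit i.
    carries = [0] + g  # carries[i] is the carry into bit i
    result = 0
    for i in range(width):
        result |= (half_sum[i] ^ carries[i]) << i
    result |= carries[width] << width
    return result


print(kogge_stone_add(200, 100))  # 300
```

In the last level of a 64-bit adder, each connection spans 32 bit positions, which is the kind of wire a second die layer could shorten.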
GA Assisted SDF Scheduling for Energy-Aware Mapping of Heterogeneous Processors (UT EE382V)

We present a method of optimizing both the energy and latency of a Synchronous Dataflow (SDF) model using genetic algorithm (GA) assisted list scheduling. Our solution is fully automatic, requiring only the SDF definition, processor specifications, and communication bus specifications. Using these parameters, latency and energy are accurately modeled, taking into account dynamic energy, static energy, and DVFS overhead. Actor firings are then optimized for minimal latency and energy using SPEA2, a multi-objective GA. The resulting schedule defines the overall timing schedule, DVFS schedule, and processor mapping. In initial tests, energy-optimized schedules consumed up to 38.5% less energy, with a minimal effect on latency. Our final system is fully configurable and versatile, working on SDFs (for DSP applications) and task graphs (for general applications).
A Piezoelectric Energy-Harvesting Shoe System for Podiatric Sensing (OSU Capstone, accepted to EMBC '14)

This paper presents an energy-harvesting, shoe-mounted system for medical sensing using piezoelectric transducers for generating power. The electronics are integrated inside a conventional consumer shoe, measuring the pressure of the wearer’s foot exerted on the sole at six locations. The electronics are completely powered by the harvested energy from walking or running, generating 10-20 µJ of energy per step that is then consumed by capturing and storing the force sensor data. The overall shoe system demonstrates that wearable sensor electronics can be adequately powered through piezoelectric energy-harvesting.
Web development
The Wave



The Wave was a website that I worked on for a research project. It used AWS and connected with Facebook for posting users’ physical activity (from a multitude of sensors, e.g. FitBit). The website used the Google graph API to create customized reports that could be generated for custom durations.
Rundezvous


Rundezvous was a running/activity tracking website. It allowed users to track activities, create events, map out trails, and create groups. All types of posts were integrated together, allowing for easy activity creation based upon events or trails. For this project, I created a custom PHP backend which connected to a MySQL database. Additionally, all styles and graphics were designed by me (Illustrator/Photoshop).
Taskoid

Taskoid was an experiment to improve upon current task applications. The main point of the application was to enable easy task creation through written text (similar to Remember the Milk). Other features such as lists, tags, etc. would be supported as well. My friend and I planned on working on a web version of the app as well as mobile versions and a REST API.
Storylines


To fill the gap left by Rundezvous, I made a quick experiment in about a week. For this I used HTML5 (canvas, CSS3), object-oriented PHP, and AJAX. I also used a cool technique to make triangles with just CSS. The project is called Storylines, and it’s a collaborative story-writing website/app. For the login, I used Facebook exclusively (to save time). There were some interesting quirks with the idea, since only one person could edit a story at a time (for plot continuity). To do this I used a technique called AJAX long polling, where the AJAX request only responds when something has changed. A person is put in a queue for editing the story and has a certain amount of (idle) time to post a line or drawing. Other features included drawings (using canvas, a color picker, and jQuery UI), branching stories (from a line), and reporting posts (a reported post fades away and is then hidden).
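The hold-until-something-changes behavior behind long polling can be sketched in a few lines. The example below is an in-process Python analogy, not the site's PHP/AJAX code: the `StoryChannel` class, its method names, and the timeouts are all hypothetical, and a real server would wrap `poll` in an HTTP request handler.

```python
import threading


class StoryChannel:
    """Holds a story's lines and lets readers block until a new one arrives."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._lines: list[str] = []

    def post_line(self, line: str) -> None:
        """Append a line and wake every waiting poll."""
        with self._cond:
            self._lines.append(line)
            self._cond.notify_all()

    def poll(self, seen: int, timeout: float = 30.0) -> list[str]:
        """Block until there are lines beyond `seen`, or until the timeout.

        Returns the new lines, or an empty list if nothing changed in time.
        """
        with self._cond:
            self._cond.wait_for(lambda: len(self._lines) > seen, timeout=timeout)
            return self._lines[seen:]


if __name__ == "__main__":
    channel = StoryChannel()
    # Simulate another writer posting a line shortly after the poll starts.
    writer = threading.Timer(0.1, channel.post_line, args=["Once upon a time"])
    writer.start()
    print(channel.poll(seen=0, timeout=5.0))
    writer.join()
```

The client would immediately issue the next poll after each response, so an update reaches every reader without repeated short-interval requests.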
Game development
Tetris
This is my attempt at replicating the game Tetris. I’ve always wanted to know how it worked or to try to replicate it. While it seems simple, most of the work goes into performance. My version is completely based upon arrays and array transformations. I was eventually thinking of making an AI to go with the game, but there are other things that need to be added or changed first.
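For a sense of what an array-based approach can look like, here is a minimal sketch of two common transformations: rotating a piece and clearing full rows. It is not the game's actual code; the grid representation (lists of 0/1 rows) and both function names are assumptions for illustration.

```python
Grid = list[list[int]]


def rotate_clockwise(piece: Grid) -> Grid:
    """Rotate a piece 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(row) for row in zip(*piece[::-1])]


def clear_full_rows(board: Grid) -> tuple[Grid, int]:
    """Drop every full row and pad the top of the board with empty rows."""
    width = len(board[0])
    kept = [row for row in board if not all(row)]
    cleared = len(board) - len(kept)
    return [[0] * width for _ in range(cleared)] + kept, cleared


# An L-shaped piece, stored as rows of filled (1) and empty (0) cells.
l_piece = [
    [1, 0],
    [1, 0],
    [1, 1],
]
print(rotate_clockwise(l_piece))  # [[1, 1, 1], [1, 0, 0]]
```

Both operations build new arrays rather than mutating in place, which keeps collision checks simple: a rotation can be tested against the board and discarded if it does not fit.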
Boxarriffic





A reaction-type game in which you must click all the green boxes before the timer runs out; otherwise you lose a life. There are 54 levels total, each increasing in difficulty, and on each level you must go through 5 sets of highlighted boxes. You get points based on how quickly you click the boxes, and retrying a level deducts 150 pts from your score. You can submit your score if it is a high score, though doing so will clear your game.
Pop Rocks
Asteroids and enemies will be created to the beat of the music. Shoot asteroids for points and power-ups (firing is needed for most power-ups). When your ship spins out of control, turn the opposite way to counteract it. You can use your shield to block damage; it will regenerate over time. Black holes will be created on large beats or changes in the music; avoid them as well as the asteroids being drawn in. Meteor showers will send meteors moving in different directions. Meteors will damage asteroids/units but also your ship, unless shielded.