Oct 17, 2021

Texture Mapping (Tutorial)

While this blog has featured mostly old skool vector graphics, which were very popular some 30 years back as there was basically no hardware support for texture mapping and CPU power was what it was, I decided to try texture mapping using the usual suspects: Python, NumPy and pygame. Of course, handling graphics pixel by pixel is much harder (and slower) than drawing one-color surfaces. Even so, the first real-time texture mapping I am aware of (in computer graphics, that is) dates as far back as September 1991: the Cube-o-Matic Amiga demo by Spreadpoint. Quite a feat, I must say, even if the resolution is only about 100 x 100 pixels, updated at perhaps 10-15 frames per second.

My basic PC handles about 5 million pixels per second in this implementation, perhaps 50 times more than Cube-o-Matic - a very modest improvement considering the resources. I used images from the Sound Vision demo I pythonized some time ago; credits for some of the images go to Red Baron and Overlander, and to Jellybean for the music.



As you can see from the video above, images can be mapped on any (convex) flat surface in 3D space. Besides the cube (squares), they are mapped on the faces of a dodecahedron (pentagons) and an icosahedron (triangles). 

The Python script and the images can be found in my GitHub repository. If you run the program and wish to control the object's rotation, use the cursor keys and the a and z keys to rotate it around the three axes.

Reading a Map, Affinely


To map a texture (image) onto a 3D rotated flat surface, one needs to form a new image, of a different size and shape, where each pixel's color corresponds to the color of the source image's pixel at the corresponding position. There are many advanced methods - and for advanced, read: slow and hard to implement - so I chose perhaps the simplest way, called Affine Texture Mapping. This, according to Wikipedia, is the fastest form of texture mapping and was already implemented in hardware in e.g. the original PlayStation (1994).

Affine mapping basically means linearly mapping each pixel on the rotated surface back to a position in the original image, and picking a color for it there. There are ways to correct for perspective distortion or to sample the original image in a more intelligent way, but in this implementation, and in my opinion, they would bring very little benefit compared to the extra processing cost.

Linear interpolation is easy between two coordinates. For a square, even if rotated to any angle, it would be easy to linearly interpolate between the pixels on opposite sides, although due to perspective one side might be shorter than the other. However, interpolating along lines that are not strictly horizontal or vertical would result in some pixels being processed many times and/or some pixels not at all. Hence, it is a much better approach to process the rotated surface line by line (horizontally or vertically), as then each pixel can be processed exactly once. I chose to start from the highest corner of the square, processing lines going down. But then one ends up with corners on the way - the process must be broken down into pieces (or sections) that contain none.
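
The core of this scanline approach is plain linear interpolation along the two edges that bound the current horizontal line. A minimal sketch of the idea, with made-up coordinates and a hypothetical edge_x helper (not the program's actual code):

    def edge_x(p0, p1, y):
        # linear interpolation: where does the edge from p0 to p1 cross height y?
        t = (y - p0[1]) / (p1[1] - p0[1])    # 0.0 at p0, 1.0 at p1
        return p0[0] + t * (p1[0] - p0[0])

    # one horizontal line of a rotated square: its left edge runs from the top
    # corner down to the left corner, its right edge down to the right corner
    top, left_corner, right_corner = (50, 10), (20, 90), (110, 60)
    y = 40
    x_left = edge_x(top, left_corner, y)      # 38.75
    x_right = edge_x(top, right_corner, y)    # 86.0
    # every integer x between x_left and x_right is then visited exactly once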

How to do Texture Mapping


The image and video below show the tutorial mode of the program (if self.tutorial_mode is set to False, it will run like the video above). First of all, for each surface (face of the cube) I have defined the picture to be used, and for each node (corner) of the surface, the respective coordinate within the picture.
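
As an illustration, per-node image coordinates for one square face could be defined along these lines (a hedged sketch; the names and the 256 x 256 size are mine, not necessarily the program's):

    import numpy as np

    # image node i gives the (x, y) pixel inside the picture that is pinned
    # to surface node i - here the picture is mapped corner-to-corner
    image_size = 256
    image_nodes = np.array([
        [0,              0             ],
        [image_size - 1, 0             ],
        [image_size - 1, image_size - 1],
        [0,              image_size - 1]
        ])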



See the blue triangle, red trapezoid (a trapezium in Britain), and green triangle in the image? They represent the sections, each of which can be processed by linear interpolation. As said, I start processing the rotated surface from the top; the starting point is the top of the blue triangle. Then, I can go down as far as the first corner on the left. Note that the triangle is strictly horizontal at the bottom, representing a line of pixels. To be sure to process each pixel, it is necessary to go down the blue triangle one horizontal line at a time, in effect increasing the y coordinate one by one and processing all the x coordinates between the two triangle sides.

To understand the mapping, it is perhaps helpful to look at how the cube object is defined. (If you wish to look into how I define and rotate 3D vector objects in more detail, see my earlier series on Real-Time 3D Vector Graphics with Python.)

            # define a cube.

            self.nodes = (100.0 * np.array([
                [ 1,  1,  1],
                [ 1,  1, -1],
                [-1,  1, -1],
                [-1,  1,  1],
                [ 1, -1,  1],
                [ 1, -1, -1],
                [-1, -1, -1],
                [-1, -1,  1]
                ])).astype(np.float64)
            self.surfaces = np.array([
                [0, 1, 2, 3],
                [4, 7, 6, 5],
                [0, 4, 5, 1],
                [2, 6, 7, 3],
                [1, 5, 6, 2],
                [3, 7, 4, 0]
                ])

The cube's nodes are simply 100 units from the origin along each axis, and the six surfaces are defined by the four nodes at their corners. For example, the first surface is defined by the first four nodes [0, 1, 2, 3], each node having +100 as its y coordinate (the coordinates are x, y, z). The second surface is the opposite one: nodes [4, 7, 6, 5] have y coordinate -100. (The reason for not using [4, 5, 6, 7] is that defining the surface nodes clockwise enables later calculating whether they are "facing" the viewer (and should be drawn) or not.)
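
As an aside, that clockwise convention enables the standard winding-order test: if the nodes no longer wind clockwise after projection to the screen, the surface is facing away. A minimal sketch of such a test (a hypothetical helper, not lifted from the program; the sign convention depends on the axis directions):

    import numpy as np

    def faces_viewer(trans_nodes, surface_nodes):
        # z component of the 2D cross product of two edge vectors: its sign
        # tells whether the first three nodes appear clockwise on screen
        p0, p1, p2 = trans_nodes[surface_nodes[:3]]
        cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])
        return cross > 0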

    def map_texture(self, surface):

        # build a node array where the nodes appear twice - enabling going "over" the right edge
        nodes_x2 = np.hstack((self.surfaces[surface, :], self.surfaces[surface, :]))
        # find the topmost node (minimum y coordinate) and maximum y coordinate
        min_y_node = np.argmin(self.trans_nodes[self.surfaces[surface, :], 1])
        max_y = np.max(self.trans_nodes[self.surfaces[surface, :], 1])
        y_beg = self.trans_nodes[nodes_x2[min_y_node], 1]
        # when going "left" and "right" through the nodes, start with the top node in both
        (left, right) = (min_y_node, min_y_node)

The map_texture function takes surface as its only argument, so for the cube object this is between 0 and 5. The rotated surface's 2D screen coordinates are in self.trans_nodes. First, I look up the highest corner min_y_node from the surface's trans_nodes (each a pair of (x, y) coordinates, hence the subscript 1 for y). Then I get the lowest and highest corner y levels into max_y and y_beg (remember that on screen, y grows downwards) and initialize the left and right node pointers with min_y_node (i.e. the highest corner).

Even before the above I have set up a nodes_x2 variable, which is simply the surface's nodes twice in one NumPy array. As an example, for surface 0, its nodes self.surfaces[0, :] would be [0, 1, 2, 3], and nodes_x2 would then be [0, 1, 2, 3, 0, 1, 2, 3]. The reason? Let's say min_y_node is node 3. Usually the lowest node, with max_y, is then node 1, the opposite corner on that face. As left and right start from node 3, I can now get to node 1 both by decreasing left and by increasing right, without checking whether I am already at the maximum number of items in the array.
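
The doubling itself is a one-liner, mirroring the hstack call above:

    import numpy as np

    surface_nodes = np.array([0, 1, 2, 3])
    nodes_x2 = np.hstack((surface_nodes, surface_nodes))
    print(nodes_x2)    # [0 1 2 3 0 1 2 3]
    # with both pointers starting at index 3 (node 3), decreasing left to
    # index 2 reaches node 2, and increasing right to index 4 reaches node 0,
    # with no bounds checking needed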

Next, I loop through each section (in the image, the blue triangle, the red trapezoid, and the green triangle):

        # loop through each section from this y coordinate to the next y coordinate until all sections processed
        while y_beg < max_y:
            # image node depends on the node order
            img_node_beg = image_nodes[np.array([left % image_node_cnt, right % image_node_cnt]), :]
            img_node_end = image_nodes[np.array([(left - 1) % image_node_cnt, (right + 1) % image_node_cnt]), :]
            img_node_diff = img_node_end - img_node_beg
            # cube node comes from surface node list
            node_beg = self.trans_nodes[nodes_x2[np.array([left, right])], :]
            node_end = self.trans_nodes[nodes_x2[np.array([left - 1, right + 1])], :]
            node_diff = node_end - node_beg

Using the left and right node pointers, I set up img_node_beg, which contains the image coordinates of the section's top nodes on the left and right sides (in this case the same node). In the image below it is node 3. Note that for other surfaces the surface nodes could be e.g. [4, 7, 6, 5], while there are always just four image nodes for cube surfaces - that is why I am using the left and right pointers with modulo and image_node_cnt (which is 4 for squares, 3 for triangles, 5 for pentagons, etc.). Similarly, the section's bottom nodes' coordinates on the left and right sides are set up in img_node_end: in this example, they are nodes 2 and 0, respectively - going left and right from the top node 3 in [0, 1, 2, 3, 0, 1, 2, 3].


The same is done using the coordinates of the rotated surface, i.e. self.trans_nodes. Here, node_beg contains the screen coordinates of node 3 (twice), and node_end those of nodes 2 and 0. img_node_diff and node_diff then simply represent the left- and right-side vectors, in the image and on the screen respectively.

Our section begins at y_beg. Then we need to know where it ends. That is simple:

            # find section end = y_end
            if node_end[1, 1] < node_end[0, 1]:
                # right node comes first (i.e. its Y coordinate is before left's)
                right += 1
                y_end = node_end[1, 1]
            else:
                # left node comes first (i.e. its Y coordinate is before or equal to right's)
                left -= 1
                y_end = node_end[0, 1]

Once we know which one comes first, the next node on the right or on the left, we can also advance the respective pointer (increasing right or decreasing left), because that is the side we need to move forward for the next section.

Now we can get working on this section on a line by line level.

Section, Part, Portion, Piece, Segment, Fragment


Our blue triangle section has a left side beginning at node 3 and ending at node 2, and a right side beginning at node 3 but ending at point m somewhere between nodes 3 and 0. Of course, we know where m is, as it must have the same y coordinate as node 2 to keep our lines horizontal. Let's move forward.

            if y_end > y_beg:
                y = np.arange(y_beg, y_end, dtype=np.int16)
                # node multipliers for each y for left and right side. Since y_end is the first node down, node_diff[:, 1] is here always > 0
                m = (y[:, None] - node_beg[:, 1]) / node_diff[:, 1]
                # respective screen x coordinates for left and right side
                x = (np.round(node_beg[:, 0] + m * node_diff[:, 0])).astype(np.int16)
                x_cnt = np.abs(x[:, 1] - x[:, 0])  # + 1   - use +1 when using linspace method below
                # count cumulative pixel count to use as the offset when storing data
                x_cnt_cum = np.hstack((np.array([0]), np.cumsum(x_cnt))) + self.scr_cnt
                # respective image coordinates, interpolating between image nodes (usually corners)
                img_l = img_node_beg[0, :] + m[:, 0:1] * img_node_diff[0, :]
                img_r = img_node_beg[1, :] + m[:, 1:2] * img_node_diff[1, :]

First, there needs to be a check if the section has zero lines. This would happen if e.g. the top two nodes of the rotated surface had the same y coordinate. If not, I build the section's y coordinates into y - for example, if y_beg = 100 and y_end = 120, then y would hold the 20 integers starting at 100 and ending at 119. The multipliers m are then calculated for each y and for both the left and the right side. In the blue triangle, the left multiplier would start from zero and end at 0.95 (19 / 20), but the right multiplier would (while starting from zero as well) end somewhere around 0.4, as that is where point m lies on the way from node 3 to node 0.
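
To make the multipliers concrete, here is a small worked example with made-up screen coordinates, using the same NumPy expression as above:

    import numpy as np

    # both sides start at node 3 at (60, 100); the left side heads for node 2
    # (20 rows down), the right side for node 0 (50 rows down)
    node_beg = np.array([[60, 100], [60, 100]])
    node_diff = np.array([[-40, 20], [50, 50]])
    y = np.arange(100, 120, dtype=np.int16)
    m = (y[:, None] - node_beg[:, 1]) / node_diff[:, 1]
    print(m[0])     # [0. 0.]
    print(m[-1])    # [0.95 0.38]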

Now we know the y coordinates for each horizontal line, and can calculate the respective left and right x coordinates using the multipliers. From those we can get the number of pixels on each horizontal line (x_cnt) and the cumulative count x_cnt_cum to use as a data pointer when storing the results. 
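
A tiny demonstration of how the cumulative count acts as a set of offsets into one flat result buffer (with made-up per-line pixel counts):

    import numpy as np

    x_cnt = np.array([3, 5, 4])     # pixels on each horizontal line
    x_cnt_cum = np.hstack((np.array([0]), np.cumsum(x_cnt)))
    print(x_cnt_cum)                # [ 0  3  8 12]
    # line i's pixels are then stored at buffer[x_cnt_cum[i]:x_cnt_cum[i + 1]]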

The image coordinates, obviously, do not lie on a horizontal line. We need to calculate the left and right end of the image data vector our horizontal line above represents, and that can be calculated using the image nodes and the same multipliers. 

Going Pixels


Now we have the screen coordinates of each horizontal line in y and x, and the respective image vector beginning and end coordinates at img_l and img_r. It's time to go through each line and figure out the surface color pixel by pixel.

                for y_line in range(y_end - y_beg):
                    # process each horizontal line, these are the x coordinates from x left to x right
                    if x_cnt[y_line] > 1:
                        # if "left" not on the left side, use negative step.
                        scr_x = np.arange(x[y_line, 0], x[y_line, 1], np.sign(x[y_line, 1] - x[y_line, 0]), dtype=np.int16)
                        # add x coordinates to self.scr_x array
                        self.scr_x[x_cnt_cum[y_line]:x_cnt_cum[y_line + 1]] = scr_x
                        # add y coordinates similarly - y is constant
                        self.scr_y[x_cnt_cum[y_line]:x_cnt_cum[y_line + 1]] = y_line + y_beg
                        # interpolate between line begin and end coordinates in image
                        self.img_xy[x_cnt_cum[y_line]:x_cnt_cum[y_line + 1], :] = (img_l[y_line, :] + ((scr_x - scr_x[0]) / (scr_x[-1] - scr_x[0]))[:, None] * (img_r[y_line, :] - img_l[y_line, :])).astype(np.int16)
                        # store the color found in each interpolated pixel in self.scr_col
                        self.scr_col[x_cnt_cum[y_line]:x_cnt_cum[y_line + 1]] = image_array[self.img_xy[x_cnt_cum[y_line]:x_cnt_cum[y_line + 1], 0], self.img_xy[x_cnt_cum[y_line]:x_cnt_cum[y_line + 1], 1]]

Note that our "left" coordinate may not actually be on the left; it is more the direction of going through the nodes, which are constantly rotated. Anyway, scr_x represents all the x coordinates on the horizontal line. These and the y coordinates are stored in self.scr_x and self.scr_y.

As we know the image data vector, i.e. the line within the image that we want to project onto this horizontal line, another interpolation is all that is needed to get the same number of coordinates (as in scr_x) on that vector, evenly spaced. Those coordinates are stored in self.img_xy.

And finally, we need the colors of those image pixels. They are stored in self.scr_col and picked from image_array, which is a pygame.surfarray.pixels2d object representing the image. 
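
In case pixels2d is unfamiliar: it exposes a pygame surface as a 2D NumPy array of mapped integer colors, indexed [x, y]. A minimal usage sketch (the file name is just an example):

    import pygame

    pygame.init()
    screen = pygame.display.set_mode((640, 480))

    # convert() matches the image to the display's pixel format, which also
    # keeps pixels2d happy (it needs an 8-, 16- or 32-bit surface)
    image = pygame.image.load('texture.png').convert()
    image_array = pygame.surfarray.pixels2d(image)
    color = image_array[10, 20]     # mapped color of the pixel at x = 10, y = 20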

That's it - the section has been processed and the screen x, y, and colors stored. Then it's time to set y_beg = y_end and process the next section. For the red trapezoid, node_beg will now contain the screen coordinates of nodes 2 and 3, and node_end those of nodes 1 and 0, as the left side has moved forward while the right side has not; y will run between the y coordinates of nodes 2 and 0.

When all surfaces have been thus processed, it is time to actually update the screen. That is done simply by updating pygame.surfarray.pixels2d(self.screen), setting each pixel defined by the coordinates self.scr_x, self.scr_y to the color self.scr_col stored previously. It would have been possible to do this in smaller pieces when collecting the data, but it seems to be faster to store all the data first and then update in one go, although that does consume more memory.

        # update screen in one go by setting the colors for the mapped pixels.
        if self.scr_cnt > 0:
            screen_array = pygame.surfarray.pixels2d(self.screen)
            screen_array[self.scr_x[0:self.scr_cnt], self.scr_y[0:self.scr_cnt]] = self.scr_col[0:self.scr_cnt]

And in case you are wondering how the animated image with the green balls and the red and blue cogwheels is done, it is simply a case of having ten still images and switching between them when getting the colors for the pixels.
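
In sketch form (with illustrative names, not the program's):

    def pick_frame(frame_arrays, frame_count):
        # frame_arrays: ten preloaded pixels2d arrays, one per still image;
        # cycle through them with a frame counter to animate the texture
        return frame_arrays[frame_count % len(frame_arrays)]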

Singularity


The process is very CPU intensive, and more than 90% of the time is spent processing the sections line by line. It would be great to use more than one processor core to share that load, but when I tried that, the end result was slower execution. Perhaps I did not do it the right way, or perhaps there are some locks I missed; or maybe the overhead, when running in real time at 30 to 50 fps, was simply too much.

Of course, using hardware to do the same would be way faster. Using optimized routines instead of high-level Python code would also be much faster, even if not by quite as much. Using e.g. OpenGL would probably boost performance a lot, although it seems rather complex judging by PyOpenGL. Perhaps some day - although using that would mean missing out on all the fun of making it yourself!
