Jul 20, 2022

Ray casting - the Game

Building upon the work introduced in my previous post, I am adding the actual rendering of the walls, ceiling, and floor. If you wish to read about ray casting itself, see the previous post (there are only slight modifications to that code here) - in this post I am focusing on how to take its output and transform it into an "almost 3D" player view.

There is a simple game on top of the programming exercise, so for those interested in trying it, some instructions:

  • You have a limited field of sight, decreasing as the light you carry grows dimmer. Collect light bulbs to extend it! If your lamp goes out you lose.
  • Collect coins - all of them to win the game. You can see progress and score at the top right corner.
  • Use the map in the bottom right corner to navigate.
  • Steer left/right and up/down using cursor keys, control speed with a and z keys. You can also move backwards.

For graphics rendering PyGame is again used, and NumPy for the calculations. Unfortunately, this is very CPU intensive; processing all pixels of the screen takes time. To keep update frequency good enough I am using a resolution of only 720 x 400; you may test with lower or higher by changing the setting at the very end of the program.

All source code can be found in my GitHub repository. The graphics mostly come from a 30-year-old Amiga demo, and the credits for them go to Nighthawk, Red Baron and Overlander, and to JellyBean for the music. See my post from May 2021 on Sound Vision Demo in Python for more about that. Also, there's another post on a more generic texture mapping routine in 3D, if you are interested. And, as a game, the Space Invaders I made a while ago is much more fun!

New Blocks on the Kid

To define how the walls will look, I am setting up a new class, Block. A Block has very few properties; it takes only one argument, the image_list. The list may have zero, one, or four entries; if there are four, each side of the Block gets its own texture. Also, if any of those entries is itself a list of images, that list will be used to animate the texture.

class Block:

    def __init__(self, image_list=None):

        # if no image list given, set up as an empty block (this is normally the " " block i.e. not a wall at all)
        if image_list is None:
            self.image_cnt = np.zeros((4))
        else:
            self.images = image_list  # image list must be a list of lists of images; but there may be only one image.
            # if given only one image (or image list) when set up, use it for all sides
            if len(image_list) == 1:
                self.images = [image_list[0]] * 4
            self.image_cnt = np.array([len(self.images[0]), len(self.images[1]), len(self.images[2]), len(self.images[3])])

        self.image_used = np.zeros((4), dtype=np.float16)  # this is a float to keep accuracy; int() used when using this.
        self.image_cnt_max = np.amax(self.image_cnt)

    def animate(self, img_add):

        # img_add should determine how many images forward to go in the animation list (and may be e.g. 0.4) and naturally should depend on time.
        if self.image_cnt_max > 1:
            self.image_used = np.mod((self.image_used + img_add), self.image_cnt)
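The fractional counter in animate is easiest to see in isolation. A minimal sketch of the same np.mod logic, using made-up frame counts (in the game these come from the image lists):

```python
import numpy as np

# per-side frame counts: three static sides and one 10-frame animation
image_cnt = np.array([1, 1, 1, 10])
image_used = np.zeros(4)

# advance by 12.5 frames in one go; each side wraps on its own cycle length,
# exactly like Block.animate() does with np.mod
image_used = np.mod(image_used + 12.5, image_cnt)
print(image_used)          # [0.5 0.5 0.5 2.5]
print(int(image_used[3]))  # frame index actually drawn for the animated side: 2
```

Because the counter is a float, a slow animation speed like 0.4 images per frame still advances the animation smoothly; int() picks the frame only at drawing time.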

After having set up all the required Blocks, I can define a map. The map is defined simply based on these Blocks and another Class called Item. There are two types of Items: coins and light bulbs. They can also be added using the map, although they will always be converted to a "no wall" (or empty) Block and the respective Item then added in the middle of that Block. Note that the map must have wall Blocks at the edges so the player is kept within the area. 

        self.wall_blocks = []   # list of available blocks (class Block)
        self.map_blocks = {}    # dictionary connecting map block (alphanumeric character in self.map) to wall_blocks list index
        self.items = []         # list of items (class Item)
        self.coin_blocks = 'cdefg'
        self.light_block = 'L'
        self.map = ['11122211111111111115551111111111111111111111111',
                    '1   f                           2222          1',
                    '1   f   1f     L      efgfedcde  22    ceg L  1',
                    '1   e  13f11     22              66    gec    1',
                    '1   e  1ee1      22      55                   3',
                    '1   d   d41           L      efefef    131    3',
                    '1  6d                                  131    3',
                    '4     ccccddee      11       2111111          3',
                    '4                            2                3',
                    '4   f         g        2  c   2    4444    f  1',
                    '1   g   11    f  L111 1   c  2eeee L 555   e  1',
                    '1   f  15451  e  16 611   c  23335     6   d  1',
                    '1   g         d     ee    c      54444 6   c  1',
                    '1      L  11     cgcgcgcg                     1',
                    '11111111111111111111111111111111111111111111111']
        self.map_array = np.zeros((len(self.map[0]), len(self.map)), dtype=np.uint8)

The self.map_blocks dictionary is created in setup_blocks: each alphanumeric character in the map needs a definition, unless it is in self.coin_blocks or equals self.light_block. That's done below (only Blocks " " (empty space) and "1" (one of the wall blocks) shown here). The empty space Block is initialized without any image list; the "1" Block with a four-item list of image lists. This list defines the pictures to be used for the four sides (up, right, down, and left in map view), and the last one is a ten-picture ray-tracing animation.

The map is then converted to a more readily usable NumPy array map_array of the same size.

    def setup_blocks(self):

        # set up all wall blocks by providing the alphanumeric "key" and image data for its sides.
        # simple blocks have only one image, used for all sides, others may have four (up, right, down, left in the map).
        # each image is given in a list, and if the list has many images, they will be animated.

        block_nr = 0
        # "nothing block" ie no walls or coins etc. at all. Must be the zero block. Using blank space for this.
        self.map_blocks[' '] = block_nr
        block_nr += 1
        self.map_blocks['1'] = block_nr
        self.wall_blocks.append(Block([[self.pic_alphaks2], [self.pic_vector3d], [self.pic_ball2],
                                       [self.pic_ray01, self.pic_ray02, self.pic_ray03, self.pic_ray04, self.pic_ray05,
                                        self.pic_ray06, self.pic_ray07, self.pic_ray08, self.pic_ray09, self.pic_ray10]]))

We Will Build a Great Wall

But Mexico will not pay for it, instead we are using open source... Anyway, as you may remember from my previous post, the raycast function will store some data (in self.ray_data) on what we see for each of the horizontal rays (800 rays if the display has 800 pixels horizontally): e.g. the Block, the side (one of four) of the Block, the map coordinates, and the distance to it - all these of the map location where our ray first hits a wall. This is perhaps easiest to see when looking at how the player view is drawn on the map view (the yellow area). So we know, for each vertical line, what and where our rays hit. From the Block and its side, we can get to the picture we need to use. From map coordinates, we can get the vertical line of that picture we must show (as each Block represents one map coordinate, map coordinate 10.5 means we see the middle (.5) of that picture etc.). And from the distance we can figure out if we even see that far; and if we do, how high the wall will be for this ray.

The raycast function will also store data on blocks of rays which can be drawn in one go (in self.grid_blocks). This essentially means that we can draw a portion of the walls, say, vertical lines (rays) between x coordinates 124 and 188, using a single picture for texture mapping, as those rays are hitting one single Block. So, I am drawing the walls in chunks, the sizes of which depend on what can be drawn in one chunk and what needs to be processed in the next.
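The exact construction happens in the raycast function from the previous post, but the chunking idea can be sketched on its own, with a hypothetical per-ray array of Block hits (the real code also tracks which side was hit):

```python
import numpy as np

# hypothetical per-ray hit data: the Block id each of 12 horizontal rays hits
ray_block = np.array([1, 1, 1, 2, 2, 5, 5, 5, 5, 0, 0, 1])

# chunk boundaries: the positions where the hit Block changes
change = np.flatnonzero(np.diff(ray_block))
starts = np.concatenate(([0], change + 1))
ends = np.concatenate((change, [ray_block.size - 1]))

# each (start, end) run of rays can be texture mapped from one source picture
chunks = [(int(a), int(b)) for a, b in zip(starts, ends)]
print(chunks)  # [(0, 2), (3, 4), (5, 8), (9, 10), (11, 11)]
```

Each pair then drives one texture-mapping pass instead of 800 per-pixel-column passes.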

To start drawing the walls, first I obtain a pygame.surfarray.pixels3d array to be able to manipulate the pixels directly with NumPy (see the Pygame tutorial on NumPy and Surfarray):

    def draw_walls(self, screen):

        # draw the walls, grid block by grid block.

        # obtain a surfarray on screen to modify it pixel by pixel with NumPy
        while screen.get_locked():
            screen.unlock()
        rgb_array_dst = pygame.surfarray.pixels3d(screen)

        # calculate where floor and ceiling end to be able to clear grid blocks not drawn as walls
        y_floor = self.height / 2 + (self.wall_height * self.view_height) / self.view_blocks  # this is the most distant visible floor y
        y_ceiling = self.height / 2 - (self.wall_height * (1.0 - self.view_height)) / self.view_blocks  # this is the most distant visible ceiling y

        # process each grid block in one go
        for i in range(0, np.size(self.grid_blocks), 2):

            # check if the wall is visible
            block_nr = int(self.ray_data[self.grid_blocks[i], 0])
            if block_nr > 0:

                x_left = self.grid_blocks[i]
                x_right = self.grid_blocks[i + 1]
                x_num = x_right - x_left + 1
                # y_top and y_bottom are screen y coordinates for each x coordinate.
                y_top = self.ray_data[x_left:x_right + 1, 6:7]
                y_bottom = self.ray_data[x_left:x_right + 1, 7:8]
                y_min = int(max(0.0, min(y_top[0], y_top[-1])))
                y_max = int(min(self.height - 1, max(y_bottom[0], y_bottom[-1]) + 0.999999))
                y_num = y_max - y_min + 1

Then, to later efficiently clear the wall area if it is beyond our view (too far), I calculate the y coordinates of the floor and ceiling at the self.view_blocks distance. The view could be e.g. 10 blocks.

Each self.grid_blocks area is then processed separately. First I check the Block from the first ray of that grid_block (grid_blocks are constructed so that the Block is the same for all rays within one). If it is positive, I proceed with defining the screen x and y coordinates. The x coordinates are constants, but the y coordinates are arrays, as the left and right ends of the drawing area may be at different distances and, hence, heights.

For example, in the image above, I have framed two such grid_blocks, one in light green and one in red color. Note that the y coordinates may go outside the drawing area (the dotted lines), but y_min and y_max are restricted to the drawing area. The rectangular areas processed below are marked by black rectangles.

Next, I figure out the source picture data, based on the Block and side information, and obtain a pygame.surfarray.pixels3d array for that as well. 

To reiterate, I am processing a rectangular area defined by x_left and x_right (horizontally) and y_min and y_max (vertically), framed in black in the image above. Mapping the y coordinates (y_map) to the source picture then goes as follows:

                    # figure out which image to use
                    block = self.wall_blocks[block_nr]
                    side_nr = int(self.ray_data[self.grid_blocks[i], 1])
                    rgb_array_src = pygame.surfarray.pixels3d(block.images[side_nr][int(block.image_used[side_nr])])
                    (x_size, y_size) = np.shape(rgb_array_src)[0:2]

                    # map y coordinates on source image. This will be a (y_num, x_num) array transposed.
                    # if y_top is negative i.e. picture starts above screen area, y_map will be positive for y = 0 (i.e. starting in mid-image).
                    # if y_top > y_min (i.e. image height at x is smaller than the drawing area height, y_map will be negative for y = 0.
                    # ...and the same for the bottom side.
                    y_map = ((np.arange(y_min, y_max + 1, dtype=np.int16) - y_top) * y_size / (y_bottom - y_top)).astype(np.int16)

In practice:

  1. make an array of all y coordinates in the rectangle: np.arange(y_min, y_max + 1)
  2. for each vertical line (ray), subtract the minimum y i.e. y_top (note it is an array)
  3. scale the resulting y coordinates to the source by multiplying by source picture y_size and dividing by the height of the vertical line (y_bottom - y_top, both of which may go outside of screen area).

It is deceptively simple but works. For example, for the first (leftmost) vertical line (ray) in the grid_block framed with red in the image above, y_min will be zero and y_max the screen height. Let's say y_top (the topmost pixel y coordinate) for that line is 100. Since y_map goes through all pixels from y_min to y_max, the first 100 values (y = 0 to 99) will be negative, as they lie above y_top. And after the y_map range goes beyond y_bottom, scaling will produce y coordinates greater than the source picture y_size. For the dotted line areas at the right edge, y_min (being zero) is bigger than y_top (which is negative), so scaling produces results starting from inside the source picture, not at its edge. Note that the end result is an array of all pixels in the processed rectangle.
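With made-up numbers (a 10-row rectangle and a 64-pixel-high source texture), the mapping for a single ray looks like this:

```python
import numpy as np

y_min, y_max = 0, 9         # screen rows of the processed rectangle
y_size = 64                 # source picture height in pixels
y_top, y_bottom = 3.0, 7.0  # wall top/bottom for this ray (may lie off-screen)

# the same formula as in draw_walls, for one ray only
y_map = ((np.arange(y_min, y_max + 1) - y_top) * y_size / (y_bottom - y_top)).astype(np.int16)
print(y_map)  # [-48 -32 -16   0  16  32  48  64  80  96]
# only rows mapping to 0 <= y_map < y_size are valid (here rows 3..6)
```

Rows above the wall top map to negative coordinates and rows below the bottom map past y_size; the valid mask below filters exactly those out.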

Mapping the x coordinates is simple, as all lines are vertical. However, depending on the side seen, and to avoid showing mirror images, the map x coordinate must be scaled slightly differently for each side. Basically:

  1. take the decimal parts of the map x coordinates (of each ray) by subtracting their floor from them
  2. multiply this by source picture x_size
  3. resize this (currently one horizontal line sized array) to cover all pixels of the rectangle processed

                    # map x coordinates on image based on side (0 = up, 1 = right, 2 = down, 3 = left) and the respective map coordinate.
                    # as block size = 1, use the decimal part of map coordinate to find the respective image x coordinate.
                    # resize the map from one line to all lines (as all lines are the same). This will be a (y_num, x_num) array transposed.
                    if side_nr == 0:
                        x_map = (np.resize(((0.999999 - (self.ray_data[x_left:x_right + 1, 2] - np.floor(self.ray_data[x_left:x_right + 1, 2])))
                                            * x_size).astype(np.int16), (y_num, x_num))).transpose()
                    elif side_nr == 2:
                        x_map = (np.resize(((self.ray_data[x_left:x_right + 1, 2] - np.floor(self.ray_data[x_left:x_right + 1, 2]))
                                            * x_size).astype(np.int16), (y_num, x_num))).transpose()
                    elif side_nr == 1:
                        x_map = (np.resize(((0.999999 - (self.ray_data[x_left:x_right + 1, 3] - np.floor(self.ray_data[x_left:x_right + 1, 3])))
                                            * x_size).astype(np.int16), (y_num, x_num))).transpose()
                    else:  # side_nr == 3
                        x_map = (np.resize(((self.ray_data[x_left:x_right + 1, 3] - np.floor(self.ray_data[x_left:x_right + 1, 3]))
                                            * x_size).astype(np.int16), (y_num, x_num))).transpose()

Now we have two arrays, y_map and x_map, mapping all the pixels of the rectangle. However, some y_map y coordinates are mapped outside the source picture! What we now have is the black rectangle, when we really need the red polygon. Let's figure out which of the pixels are validly mapped:

                    # pick the valid pixels i.e. y is mapped inside the source picture (valid is False when y is negative or greater than image height).
                    valid = (y_map >= 0) & (y_map < y_size)

The valid array is simply combining two True/False arrays, resulting in an array again covering all pixels of the rectangle, with True if the y coordinate is within the source picture and False if it falls outside of it (True representing the red polygon as intended). 

Next, we can calculate shading, if necessary. Shading is applied based on two variables, self.view_blocks and self.view_shade, to shade the rendered image to black when the limit of our visibility is reached. If e.g. self.view_blocks = 10 and self.view_shade = 0.7, the pixels will get a shading from distance 7.0 (still full brightness) to 10.0 (faded to black). This multiplier array (always between 0 and 1) is calculated for each ray using its distance data, and then resized to cover again all pixels of the rectangle.
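Plugging in those example numbers (view_blocks = 10, view_shade = 0.7) with some hypothetical ray distances, the multiplier behaves like this:

```python
import numpy as np

view_blocks, view_shade = 10.0, 0.7
dist = np.array([6.0, 7.0, 8.5, 10.0])  # hypothetical (uncorrected) ray distances

# multiplier: 1.0 at or below 7.0 blocks, fading linearly to 0.0 at 10.0 blocks
shade = np.minimum(1.0, np.maximum(0.0, (view_blocks - dist) / (view_blocks * (1.0 - view_shade))))
print(np.round(shade, 2))  # approximately 1.0, 1.0, 0.5, 0.0
```

Multiplying the RGB values by this array fades the pixels to black as the distance limit is approached.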

                    # draw all pixels of destination range but again dropping out the pixels not mapped on the image.
                    # picking the mapped image coordinates from source data; these arrays will be flattened when applying [valid].

                    if np.amax(self.ray_data[x_left:x_right + 1, 4]) > self.view_shade * self.view_blocks:
                        # add shading when objects are far away. Shading applied if max distance (not corrected) > self.view_shade * self.view_blocks
                        shade = (np.resize(np.minimum(1.0, np.maximum(0.0, (self.view_blocks - self.ray_data[x_left:x_right + 1, 4])
                                                                      / (self.view_blocks * (1.0 - self.view_shade)))), (y_num, x_num, 1))).transpose(1, 0, 2)
                        # using shade array to phase the image to black in the distance.
                        rgb_array_dst[x_left:x_left + x_num, y_min:y_min + y_num][valid] = rgb_array_src[x_map[valid], y_map[valid]] * shade[valid]

                    else:
                        # no shading required
                        rgb_array_dst[x_left:x_left + x_num, y_min:y_min + y_num][valid] = rgb_array_src[x_map[valid], y_map[valid]]

            else:
                # if the wall block is not visible, clear the area.
                screen.fill(self.background_color, (self.grid_blocks[i], y_ceiling - 1, self.grid_blocks[i + 1] - self.grid_blocks[i] + 1, y_floor - y_ceiling + 3))

Then, what's left is to actually copy the mapped pixels to the screen - for the valid area and applying shading, if necessary. That's done above: processing the rectangle rgb_array_dst[x_left:x_left + x_num, y_min:y_min + y_num] and using only its [valid] pixels, set them to the colors found in the source picture rgb_array_src[x_map[valid], y_map[valid]], applying * shade[valid] if necessary.

And, if it was determined in the beginning that the Block is not a wall Block (<= 0), the area is beyond visibility and must be cleared. For that I am using a simple fill. In the image above, the rightmost part of the screen represents such an area.

Casting the Floor and the Ceiling

With the walls ready, we can proceed to rendering the floor and the ceiling. They are technically mirror images of one another, so one solution would suffice. However, I wrote two: one works in a way similar to the walls, selecting valid pixels from a rectangular area; the other casts the whole rectangle. While the latter has the advantage of not having to bother with defining what's valid and what's not, it has to be used before rendering the walls, as it would otherwise overlap them. The former method can be used either before or after the wall rendering, as there's no overlap.

The two methods are very similar, and selecting the valid pixels works much the same way as for the walls, so I am only explaining the one that covers the whole floor or ceiling without checking whether pixels overlap the walls. Also, in the code below, I have left out the parts where the ceiling is handled, assuming is_floor=True, to keep it a bit shorter.

    def draw_floor_or_ceiling_whole(self, screen, block_size, pic, is_floor=True):

        # draw the floor or the ceiling - all in one go.
        # block_size defines how many map blocks the image (pic) should cover in (x, y)
        # if is_floor is False then will draw the ceiling.
        # this version draws the whole floor/ceiling rectangle, not testing for existing walls, so must be run before drawing walls.

        # obtain a surfarray on screen to modify it pixel by pixel with NumPy
        while screen.get_locked():
            screen.unlock()
        rgb_array_dst = pygame.surfarray.pixels3d(screen)

        rgb_array_src = pygame.surfarray.pixels3d(pic)
        bl_x_size = block_size[0]
        bl_y_size = block_size[1]
        x_size = int(np.shape(rgb_array_src)[0] // bl_x_size)
        y_size = int(np.shape(rgb_array_src)[1] // bl_y_size)
        min_block = np.min(self.ray_data[:, 0])  # to test if all blocks are walls i.e. min_block > 0

First I again obtain pygame.surfarray.pixels3d arrays for both the display and the source picture. The function takes an argument block_size, which refers to the floor/ceiling block relative to a map Block - I am using size [2, 2] so that the source picture will cover a two by two map Block area. This can be freely set, but only integers are allowed. The x_size and y_size of the source picture are already adjusted for block_size as well.

        # using the same formula as for calculating y_top and y_bottom for wall distance (see self.raycast())
        # we can also calculate distance for each y_coord. This is the same for all rays / viewing angles.
        y_top = self.height / 2 + (self.wall_height * self.view_height) / self.view_blocks  # this is the most distant visible floor y
        if min_block > 0:
            y_num = int(self.height - max(y_top, np.amin(self.ray_data[:, 7])))  # if floor is cut off before y_top by walls, use the highest cut off point
        else:
            y_num = int(self.height - y_top)
        y_coord = np.arange(0, y_num)
        dist = (self.wall_height * self.view_height) / (self.height / 2 - y_coord)
        # find the left and right edges of rectangle (ie. do not draw floor/ceiling if wall covers the whole vertical space left or right)
        x_left = (self.ray_data[:, 7] < self.height).argmax()  # x_left is the first x where y is not below screen
        x_right = self.width - (np.flip(self.ray_data[:, 7]) < self.height).argmax()  # x_right is one past the last x where y is not below screen

Then I calculate the y_top of the floor (for the ceiling, we'd have an y_bottom), the most distant (and hence highest) y coordinate of the floor that we can see. This depends on the visibility (self.view_blocks), player height (high or low, self.view_height), the predefined wall height constant (self.wall_height), and of course the screen resolution (self.height). However, if all rays hit Blocks i.e. all vertical lines have walls so min_block > 0, we may not need to cover all lines from y_top downwards. If the highest wall bottom coordinate (np.amin(self.ray_data[:, 7])) is bigger, we'll use that instead for calculating y_num, the number of lines from the bottom of the display to be covered.

y_coord will be a range of y coordinates from the bottom to the highest / maximum distance floor coordinate, and dist will be the distance of those y_coord on the map.
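With illustrative numbers (a 400-pixel-high screen and wall_height * view_height = 100, both made up), the distance per floor row works out as:

```python
import numpy as np

height = 400  # screen height in pixels
wh = 100.0    # stands in for wall_height * self.view_height

# a few selected floor rows, counted from the bottom of the screen
y_coord = np.array([0, 100, 150, 180])
dist = wh / (height / 2 - y_coord)
print(dist)  # [0.5 1.  2.  5. ] - distance grows towards the horizon at height / 2
```

The bottom rows are closest to the player; the distance diverges as y_coord approaches the screen center line.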

Then, to again limit the area covered if possible, I am checking whether walls actually cover all pixels on the left or right side of the screen, again using the bottom coordinates self.ray_data[:, 7]. x_left and x_right are finding the first x coordinate where the wall bottom is above display bottom line from the left and the right, respectively.
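The argmax-on-a-boolean-array trick is worth a small sketch of its own, with hypothetical wall bottom coordinates for eight rays:

```python
import numpy as np

height = 10
# hypothetical wall bottom y per ray; >= height means the wall fills the whole column
wall_bottom = np.array([12, 11, 7, 5, 9, 10, 13, 14])

# argmax on a boolean array returns the index of the first True
x_left = int((wall_bottom < height).argmax())                               # first column with floor visible
x_right = int(wall_bottom.size - (np.flip(wall_bottom) < height).argmax())  # one past the last such column
print(x_left, x_right)  # 2 5
```

Flipping the array and subtracting from the width gives the right edge with the same first-True trick.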

Then we can move to mapping the pixels of this destination rectangle to the source picture:

        if y_num > 1 and x_right > x_left:

            # mapping these to our map coordinates simply use distance above and go linearly from viewer position to wall position for each ray.
            # using np.mod to pick the remainder and scaling that to image coordinate.
            x_map = (np.mod(self.position[0] + (self.ray_data[x_left:x_right, 2:3] - self.position[0]) * dist / self.ray_data[x_left:x_right, 5:6],
                            bl_x_size) * x_size).astype(np.int16)
            y_map = (np.mod(self.position[1] + (self.ray_data[x_left:x_right, 3:4] - self.position[1]) * dist / self.ray_data[x_left:x_right, 5:6],
                            bl_y_size) * y_size).astype(np.int16)

            if dist[-1] / self.distance_correction[0] > self.view_shade * self.view_blocks:
                # add shading when floor/ceiling far away. Shading applied if max distance (not corrected for fishbowl) > self.view_shade * self.view_blocks
                shade = ((np.minimum(1.0, np.maximum(0.0, (self.view_blocks - dist / self.distance_correction[x_left:x_right, None]) /
                                                     (self.view_blocks * (1.0 - self.view_shade))))))[:, :, None]

                rgb_array_dst[x_left:x_right, -1:-1 - y_num:-1] = (rgb_array_src[x_map.ravel(), y_map.ravel()]).reshape(x_right - x_left, y_num, 3) * shade
            else:
                # no shading needed
                rgb_array_dst[x_left:x_right, -1:-1 - y_num:-1] = (rgb_array_src[x_map.ravel(), y_map.ravel()]).reshape(x_right - x_left, y_num, 3)

Both x_map and y_map are calculated the same way: linearly interpolating between the player position and the wall position on the map using the above calculated distance (dist) for each y_coord. If you think of it on the game map picture, we are going from player position to the maximum distance for each ray on the yellow viewing area (although this may be a smaller subset as described). Then the resulting map coordinates are converted to source picture pixels by taking the modulo using block_size (resulting, if block_size is [2, 2], in a number between 0 and 2, 2 excluded) and scaling by x_size/y_size.

Shading can be applied, when needed, in the same way as for the walls. Actually copying the colors from source to destination looks a bit more complicated here, because the destination array is a simple NumPy array using basic indexing. When, for the walls, we selected only the [valid] pixels, that resulted in a "flat" indexing array (see advanced indexing) on both sides. Here, I need to make the source array match the destination shape, and that is done by first using ravel() to make the mappings flat, and then reshaping the resulting array to the same shape as the destination. As NumPy cleverly keeps the actual data and its shape (the metadata) separate, such reshaping can usually be done very quickly by just changing the metadata.
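A tiny sketch of why the ravel/reshape dance is needed, and why it is cheap, using a stand-in 3 x 4 array instead of pixel data:

```python
import numpy as np

src = np.arange(12).reshape(3, 4)   # stand-in for a tiny source picture
x_map = np.array([[0, 1], [2, 2]])  # per-pixel source coordinates, in destination shape
y_map = np.array([[0, 0], [3, 1]])

# advanced indexing with flattened coordinate arrays yields a flat result...
flat = src[x_map.ravel(), y_map.ravel()]
print(flat.tolist())  # [0, 4, 11, 9]

# ...and reshape restores the destination shape by rewriting metadata only, no copy
print(np.shares_memory(flat, flat.reshape(2, 2)))  # True
```

Note that this sketch indexes src as [row, column], whereas pygame surfarrays are indexed [x, y]; the mechanics are the same.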


I have not covered how the Items (coins and light bulbs) are processed, but that is relatively simple. The "engine" could of course be used to create something else, like a Doom-style game, and it would be rather trivial to add levels by simply changing the map after one is completed, or to add other kinds of Items. However, for me the main point was to try using NumPy to (hopefully) cleverly render large blocks of the screen in one go, avoiding slow-running Python loops.

I hope you enjoyed this!

