This article is about the engineering side of image slide puzzles: how an arbitrary photo becomes a playable N×N puzzle. It is the part players never see, and the part developers spend more time on than they expect.
The pipeline
Three stages, in order:
1. Square cropping
Most photos are not square. By default, iPhone photos are 4:3 in landscape and 3:4 in portrait, whether they are stored as HEIC or JPEG. A slide-puzzle board is 1:1, so step one is choosing which square region of the photo to play.
Two common approaches:
Centre crop. Take the largest centred square that fits. Fast, predictable, occasionally wrong (subject not in the centre).
Interactive crop. Show the user a draggable square overlay. Let them choose. Slower, but the user always gets what they want.
Slide Puzzle uses interactive cropping, defaulting to centre. The crop is non-destructive — the original file in the user's photo library is never modified.
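The crop geometry is small enough to sketch. The following is a minimal, language-neutral illustration in Python, not the app's actual code; the function names `centre_crop_rect` and `clamp_crop` are hypothetical. It computes the default centred square and keeps a user-dragged square inside the photo.

```python
def centre_crop_rect(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (x, y, side, side) for the largest centred square in the image."""
    side = min(width, height)
    x = (width - side) // 2
    y = (height - side) // 2
    return (x, y, side, side)


def clamp_crop(x: int, y: int, side: int, width: int, height: int) -> tuple[int, int, int, int]:
    """Keep a dragged crop square fully inside the image bounds."""
    side = min(side, width, height)
    x = max(0, min(x, width - side))
    y = max(0, min(y, height - side))
    return (x, y, side, side)


# A 4:3 landscape photo and the same photo in 3:4 portrait.
landscape = centre_crop_rect(4032, 3024)
portrait = centre_crop_rect(3024, 4032)
```

Because only the rectangle is stored, the crop stays non-destructive: the original pixels are read through it and never rewritten.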
2. Resize to working resolution
Once cropped, the image is still typically 3000×3000 or larger (depending on the camera). For puzzle play this is overkill. A 6×6 board displayed at 320pt on a 3× iPhone screen renders each tile at about 53pt = 160 device pixels. A source resolution of 1024×1024 gives a per-tile resolution of about 170 pixels — sharper than the display can show.
Resizing to 1024 (or sometimes 2048) is the standard. Bigger sources waste memory and slow loading without improving the visible result.
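The arithmetic behind that choice can be checked in a few lines. This is a sketch under stated assumptions: the helper names and the candidate sizes (512, 1024, 2048) are illustrative, not taken from any particular app.

```python
def tile_device_pixels(board_points: float, scale: float, n: int) -> float:
    """Device pixels along one tile edge for an n×n board."""
    return board_points * scale / n


def tile_source_pixels(source_side: int, n: int) -> float:
    """Source-image pixels available along one tile edge."""
    return source_side / n


def smallest_sufficient_source(board_points: float, scale: float,
                               candidates=(512, 1024, 2048)) -> int:
    """Pick the smallest candidate side that covers the board in device pixels."""
    needed = board_points * scale
    for side in candidates:
        if side >= needed:
            return side
    return candidates[-1]  # fall back to the largest size on offer


on_screen = tile_device_pixels(320, 3, 6)     # 160.0 device pixels per tile
available = tile_source_pixels(1024, 6)       # about 170.7 source pixels per tile
chosen = smallest_sufficient_source(320, 3)   # 1024
```

The board size N cancels out of the comparison: what matters is the board's total width in device pixels against the source side.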
3. On-demand tile rendering
This is the part most people get wrong on their first attempt. The image is not actually cut into separate files, one per tile (16 for a 4×4 board). That would be slow, wasteful, and unnecessary.
Instead, the app renders each tile by drawing the same source image into a tile-sized rectangle, with the source rectangle offset by the tile's position in the goal image. In CSS this is the background-position technique; in Core Graphics it is a clip rectangle set on the drawing context with CGContext.clip(to:); in SwiftUI it is an offset Image inside a tile-sized frame with the clipped() modifier applied.
In pseudocode, drawing the tile whose goal position is (row, col) on an N×N board, with an image of side S:
draw image at (-col * S/N, -row * S/N)
within a clip rectangle of (S/N × S/N)
That's it. Sixteen tiles for a 4×4 means 16 calls to draw the same source image with 16 different offsets. The app never needs to cut the file.
This is also why slicing a 6×6 board is instant: it is 36 calls to the same renderer, not 36 file operations.
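The pseudocode above translates almost directly into a list of draw instructions. The sketch below is illustrative only: `TileDraw` and `tile_draws` are hypothetical names, and a real renderer would pass these numbers to its own drawing API.

```python
from typing import NamedTuple


class TileDraw(NamedTuple):
    row: int          # goal row of the tile
    col: int          # goal column of the tile
    offset_x: float   # where to draw the full image, relative to the clip origin
    offset_y: float
    clip_side: float  # side of the square clip rectangle


def tile_draws(n: int, side: float) -> list[TileDraw]:
    """One draw instruction per tile; every tile reuses the same source image."""
    tile = side / n
    return [
        TileDraw(row, col, -col * tile, -row * tile, tile)
        for row in range(n)
        for col in range(n)
    ]


draws_4x4 = tile_draws(4, 1024)   # 16 instructions, no files written
draws_6x6 = tile_draws(6, 1024)   # 36 instructions, same source image
```

Changing the board size only changes the length of this list, which is why re-slicing costs nothing measurable.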
What about animation
Sliding a tile is a translate transform on the rendered tile. The image content inside the tile does not change; only the on-screen position of the clipped tile does. This makes the slide animation cheap for the GPU, so it can run at the display's full refresh rate, up to 120 Hz on ProMotion displays.
Three implementation details that matter:
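- The clip and its content move together. Nest the image inside the clipped tile and apply the slide transform to the tile as a whole. If the transform reaches only one of the two, the clip reveals a different part of the image and the slide breaks.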
- Use transform, not layout. Animating position via a CSS transform or SwiftUI offset is composited on the GPU; animating left/top forces a layout pass on every frame.
- Pre-rasterise on first appearance. The first frame of a tile rendering can be slow because the source image must be decoded. Render all tiles once at game start to warm them up.
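One way to keep these rules straight is to model a tile with two independent coordinates: a goal position that fixes which part of the image it shows, and a current position that fixes where it sits on screen. The sketch below assumes a simple linear interpolation; `Tile` and `slide_frames` are hypothetical names, and a real app would hand the end points to its animation system instead of computing frames by hand.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    goal_row: int   # fixes the image content; never changes during play
    goal_col: int
    row: int        # current board position; changes on every slide
    col: int

    def content_offset(self, tile_side: float) -> tuple[float, float]:
        """Offset of the image inside the clip, from the goal position."""
        return (-self.goal_col * tile_side, -self.goal_row * tile_side)

    def screen_translate(self, tile_side: float) -> tuple[float, float]:
        """Translate applied to the whole tile, from the current position."""
        return (self.col * tile_side, self.row * tile_side)


def slide_frames(tile: Tile, to_row: int, to_col: int,
                 tile_side: float, steps: int) -> list[tuple[float, float]]:
    """Linearly interpolated translate values for one slide."""
    x0, y0 = tile.screen_translate(tile_side)
    x1, y1 = to_col * tile_side, to_row * tile_side
    frames = [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(1, steps + 1)
    ]
    tile.row, tile.col = to_row, to_col
    return frames


tile = Tile(goal_row=0, goal_col=1, row=2, col=2)
before = tile.content_offset(80)
frames = slide_frames(tile, to_row=2, to_col=3, tile_side=80, steps=4)
after = tile.content_offset(80)
```

Only the translate changes across the frames; the content offset is identical before and after the slide.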
Storage
Where does the imported image actually live?
In a privacy-respecting app, in the app's sandbox, encrypted at rest by iOS. The photo library still holds the original; the app stores a working copy at 1024×1024 inside its own Documents folder. When the user deletes the app, the working copy is deleted with it.
In a cloud-based app, the working copy lives on a server somewhere. The implications are very different — for privacy, for offline play, and for what happens when the server goes away. Slide Puzzle is the sandbox kind.
Memory budget
A typical 4×4 image slide puzzle:
- Source image: ~3 MB JPEG.
- Decoded image in memory: 1024 × 1024 × 4 bytes = 4 MB.
- One copy per active game.
That is fine. A 6×6 board uses the same source image and the same decoded buffer, so a larger board needs no extra image memory.
Where memory budgets get tight is for cover libraries — 300 covers × 4 MB = 1.2 GB if all decoded at once. Apps avoid this by decoding on demand (only when a cover is shown) and releasing on backgrounding (only the active game's image is kept in memory).
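The numbers, and the decode-on-demand idea, can be sketched as follows. The least-recently-used eviction policy is an assumption chosen for illustration; the class and method names are hypothetical, and `decode` stands in for whatever image decoder the platform provides.

```python
from collections import OrderedDict


def decoded_bytes(side: int, bytes_per_pixel: int = 4) -> int:
    """Memory for one decoded square image (4 bytes per pixel for RGBA)."""
    return side * side * bytes_per_pixel


class DecodedImageCache:
    """Decode covers only when requested; keep at most max_items in memory."""

    def __init__(self, decode, max_items: int):
        self._decode = decode
        self._max_items = max_items
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)      # mark as most recently used
            return self._items[key]
        image = self._decode(key)             # decode on demand
        self._items[key] = image
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)   # evict the least recently used
        return image

    def release_all_except(self, keep) -> None:
        """Call on backgrounding: drop everything but the active game's image."""
        for key in list(self._items):
            if key != keep:
                del self._items[key]

    def __len__(self) -> int:
        return len(self._items)


all_at_once = 300 * decoded_bytes(1024)       # about 1.2 GB

cache = DecodedImageCache(decode=lambda key: bytearray(16), max_items=3)
for cover_id in range(10):
    cache.get(cover_id)
held_while_browsing = len(cache)
cache.release_all_except(9)
held_in_background = len(cache)
```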
Edge cases
Three things go wrong in practice:
Photos with EXIF orientation tags. A photo taken in portrait but stored landscape with a rotation tag will display correctly in the photo library and wrongly in the puzzle if the app forgets to apply the EXIF rotation. The first version of every photo-puzzle app has this bug.
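To make the orientation values concrete, here is a sketch that applies each of the eight EXIF orientations to a small grid of pixels. It is for illustration only: the function names are hypothetical, and a real app would normally let the platform's image framework apply the tag instead of rearranging pixels by hand.

```python
def apply_exif_orientation(pixels, orientation: int):
    """Return the rows of `pixels` rearranged so the image displays upright."""
    rows = [list(r) for r in pixels]
    if orientation == 2:    # mirrored left to right
        return [r[::-1] for r in rows]
    if orientation == 3:    # rotated 180 degrees
        return [r[::-1] for r in rows[::-1]]
    if orientation == 4:    # mirrored top to bottom
        return rows[::-1]
    if orientation == 5:    # mirrored across the main diagonal
        return [list(r) for r in zip(*rows)]
    if orientation == 6:    # stored data needs a 90 degree clockwise turn
        return [list(r) for r in zip(*rows[::-1])]
    if orientation == 7:    # mirrored across the anti-diagonal
        return [list(r)[::-1] for r in zip(*rows)][::-1]
    if orientation == 8:    # stored data needs a 90 degree counter-clockwise turn
        return [list(r) for r in zip(*rows)][::-1]
    return rows             # 1, or a missing or unknown tag: leave as stored


def oriented_size(width: int, height: int, orientation: int) -> tuple[int, int]:
    """Orientations 5 to 8 swap width and height once applied."""
    return (height, width) if orientation in (5, 6, 7, 8) else (width, height)


stored = [[1, 2],
          [3, 4],
          [5, 6]]
upright = apply_exif_orientation(stored, 6)
```

The size swap matters for the crop step too: computing the square from the stored dimensions of a rotated photo picks the wrong region.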
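Very large source images. Some HEIC files are 6000×8000 pixels, which is about 192 MB once decoded at 4 bytes per pixel. Loading these into memory at full resolution will crash an app on smaller iPhones. The fix is to decode directly at a downsampled target size; Apple's ImageIO supports this through CGImageSourceCreateThumbnailAtIndex with the kCGImageSourceThumbnailMaxPixelSize option. Decode from disk with a maximum side of 2048 pixels; never decode at full size.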
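The saving is easy to quantify. The helpers below are a sketch with hypothetical names; they compute only the sizes involved, leaving the actual decoding to the platform's image framework.

```python
def full_decode_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Memory needed to decode the whole image at full resolution."""
    return width * height * bytes_per_pixel


def downsampled_size(width: int, height: int, max_side: int = 2048) -> tuple[int, int]:
    """Scale so the longer side is at most max_side, keeping the aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return (width, height)   # already small enough; never upscale
    return (
        max(1, round(width * max_side / longest)),
        max(1, round(height * max_side / longest)),
    )


full = full_decode_bytes(6000, 8000)        # 192,000,000 bytes
small_w, small_h = downsampled_size(6000, 8000)
small = full_decode_bytes(small_w, small_h)
```

For the 6000×8000 example the downsampled decode is 1536×2048, roughly a fifteenth of the full-size memory.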
Sub-pixel rendering. Tile sizes that don't divide evenly into the screen pixel grid produce a fractional pixel column on one side of each tile. With pure clipping, this shows as a hairline gap. Fix by either snapping tile sizes to integer pixels (visible jitter at non-standard board sizes) or letting tiles overlap by 0.5 pixel (no gap, no jitter).
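Both fixes can be compared along one axis of the board. This is a sketch with hypothetical names; it assumes each tile keeps its exact origin and that, for the overlap fix, every tile except the last is drawn half a pixel wider.

```python
def has_fractional_tiles(board_px: int, n: int) -> bool:
    """True when the board width does not divide evenly into n tiles."""
    return board_px % n != 0


def snapped_tile_frames(board_px: int, n: int) -> list[tuple[int, int]]:
    """(origin, width) per tile with integer widths; may leave pixels unused."""
    side = board_px // n
    return [(i * side, side) for i in range(n)]


def overlapped_tile_frames(board_px: int, n: int,
                           overlap: float = 0.5) -> list[tuple[float, float]]:
    """(origin, width) per tile; each tile but the last covers the seam after it."""
    tile = board_px / n
    return [
        (i * tile, tile + (overlap if i < n - 1 else 0.0))
        for i in range(n)
    ]


snapped = snapped_tile_frames(960, 7)        # 7 tiles of 137 px cover 959 px
overlapped = overlapped_tile_frames(960, 7)
seams_covered = all(
    overlapped[i][0] + overlapped[i][1] > overlapped[i + 1][0]
    for i in range(len(overlapped) - 1)
)
```

With the overlap fix the image offset inside each tile is left unchanged, so the picture stays continuous across the covered seam.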
These are not deep problems, but first-time implementers routinely forget them.
Summary
The pipeline is: crop → resize → clip-and-translate to render tiles → translate transforms to animate slides. The image never leaves the device in a privacy-respecting app. The whole pipeline fits in about 200 lines of code, plus another 50 for the EXIF rotation handling that everyone forgets the first time.
Image slide puzzles are simpler than they look from the outside. The image stays one image; the tiles are just framed views of it.