Ripping data from .afs format


Chu
2007-09-11, 01:42
There's a game I'm trying to extract data from (audio, CGs, script, the works) but they're all packed in .afs, which I don't know anything about. I tried AFS extract but all of the CGs are in .cel format, so I can't view them. Is there a better program, or a way to convert the images to a viewable format?

Asceai
2007-09-11, 01:56
I've never even heard of that one before. Throw a .cel or two this way. (and a PNG of the image as a screenshot, if you can guess which one it is from the scripts you extracted or something)

Chu
2007-09-11, 02:01
I have absolutely no clue which images are which ^^; So there's some guess work involved.

http://www.u-chan.org/spare/C11B2M2.cel

Minagi
2007-09-11, 20:38
I think that might be an animation file and not CG. What other files are in the game?

Quick look at it and it's:

Header (0xf0 bytes, don't care enough to figure out what everything means)
I'm going to guess 0xC is the amount of files in the cel file, but I can't tell from just one file.


IMAG block, 32 bytes:
IMAG 4 bytes
End of LZ file 4(maybe 8?) bytes
(24 bytes I don't feel like figuring out)

LZSS File:
LZ 2 bytes
Uncompressed file size 4(6?) bytes
Size of file without header 2(4?) bytes
LZSS data

Something like that.

zalas
2007-09-11, 22:43
Upon quick glance, I think that DWORD after the chunk name doesn't necessarily mean the end of the LZ chunk. I think it's a generic chunk length thing, and it points to the next chunk in the list of chunks. Why they chose to do it that way instead of length beats me. Maybe they intended to load the entire thing into memory...
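A rough sketch of walking the chunk list under that reading. It assumes the first chunk starts right after the 0xF0-byte header Minagi measured, that every chunk begins with a 4-byte ASCII tag, and that the following DWORD is a little-endian absolute file offset of the next chunk. None of that is confirmed, and `walk_chunks` is just an illustrative name.

```python
import struct

HEADER_SIZE = 0xF0  # assumption: header length estimated earlier in the thread


def walk_chunks(data, start=HEADER_SIZE):
    """Yield (tag, offset, next_offset) for each chunk in a .cel file.

    Assumes: 4-byte ASCII tag, then a little-endian DWORD holding the
    absolute offset of the next chunk.
    """
    pos = start
    while pos + 8 <= len(data):
        tag = data[pos:pos + 4].decode("ascii", errors="replace")
        (next_pos,) = struct.unpack_from("<I", data, pos + 4)
        yield tag, pos, next_pos
        # Stop at the end marker, or if the pointer does not move forward.
        if tag == "ENDC" or next_pos <= pos:
            break
        pos = next_pos
```

Usage would be something like `for tag, off, nxt in walk_chunks(open("C11B2M2.cel", "rb").read()): print(tag, hex(off))`. If the pointers turn out to be relative rather than absolute, only the `pos = next_pos` line needs changing.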

As for the first header, it has 0100 0300 0200 0800. Not sure about the 0100, but the 0300, 0200 and 0800 probably refer to the number of ENTR, ANIM and IMAG chunks, respectively. The DWORD at offset 0x10 looks like the size of the file and the DWORD following might be the size of the file after decompression of all the chunks? It could also be a CRC32, but who knows...

EDIT: It's probably more likely the CRC32, since the files don't really expand to something that large. I find it funny how the LZ chunks are stored in big-endian format while the rest of the file is in little-endian.
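For anyone following along, this is how the eight header bytes quoted above decode when read as little-endian 16-bit words. The variable names reflect zalas's guess about what each count means and are not confirmed.

```python
import struct

# The eight header bytes quoted in the post: 01 00 03 00 02 00 08 00
raw = bytes.fromhex("0100030002000800")

# Read them as four little-endian 16-bit words.
unknown, n_entr, n_anim, n_imag = struct.unpack("<4H", raw)

print(unknown, n_entr, n_anim, n_imag)  # 1 3 2 8
```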

Asceai
2007-09-12, 04:04
Post more files. As a general rule, full-screen CG is often ~1 MB or so, depending on the resolution. These .CEL files seem to be tiny on-screen animations, like what haruoto and ef use.

Chu
2007-09-12, 10:46
It looks like it packed animations and CGs all in the same .afs, so I just picked out the largest one and uploaded it.

http://www.u-chan.org/spare/EV083_1.cel (assuming this one is an event CG)
http://www.u-chan.org/spare/BG002_1.cel (assuming this one is a background CG)

Minagi
2007-09-12, 11:23
It looks like this now:

IMAG block
IMAG 4 bytes
Start of next chunk/end of current chunk (it's still there even with 1 file) 4 bytes
Null 4 bytes
Height 2 bytes
Width 2 bytes
? 12 bytes
Size of file 4 bytes

(When LZ is present)
LZ header
LZ 2 bytes
4 bytes big endian decompressed file size (probably wouldn't have noticed it was in big endian so thanks zalas :3)
4 bytes big endian compressed file size
LZ Data (Maybe it's not standard? It looks messed up after decompression)
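The layout above could be read with something like the following sketch. The field order and sizes come straight from the listing; the field names, and the assumption that the IMAG header itself is little-endian, are guesses. Only the two LZ size fields are known to be big-endian.

```python
import struct
from collections import namedtuple

ImagHeader = namedtuple("ImagHeader", "next_chunk height width unknown data_size")

# tag, next-chunk pointer, null, height, width, 12 unknown bytes, data size
IMAG_STRUCT = struct.Struct("<4sIIHH12sI")  # 32 bytes in total


def parse_imag(data, offset):
    """Parse the 32-byte IMAG header found at `offset`."""
    tag, next_chunk, _null, height, width, unknown, data_size = \
        IMAG_STRUCT.unpack_from(data, offset)
    if tag != b"IMAG":
        raise ValueError("not an IMAG chunk")
    return ImagHeader(next_chunk, height, width, unknown, data_size)


def parse_lz_header(data, offset):
    """Return (decompressed_size, compressed_size), or None for raw data."""
    if data[offset:offset + 2] != b"LZ":
        return None
    return struct.unpack_from(">II", data, offset + 2)  # big-endian sizes
```

The image payload would then start at `offset + IMAG_STRUCT.size`, with `parse_lz_header` telling you whether it is compressed.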

Messing around with the uncompressed file in IrfanView, I was able to come up with this from BG002_1.cel with the settings 640x480 32 BPP BGR Interleaved:
http://img210.imageshack.us/img210/7601/file2px2.png

From EV083_1.cel (doesn't use an LZ file, it's a straight up raw image). Same settings as BG002_1.cel except it's 640x960:
http://img210.imageshack.us/img210/1999/eveb6.png
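For batch work, the same raw data can be turned into a PNG without an image viewer. The sketch below uses only the standard library and assumes the pixels are 4 bytes each in B, G, R, A order, as the settings above suggest. Whether the fourth byte is a real alpha channel is unknown, so there is an option to force it opaque. `bgra_to_rgba` and `encode_png` are illustrative names.

```python
import struct
import zlib


def bgra_to_rgba(pixels, force_opaque=False):
    """Swap the blue and red channels of 32-bit interleaved pixel data."""
    if len(pixels) % 4:
        raise ValueError("pixel data is not a multiple of 4 bytes")
    out = bytearray(pixels)
    out[0::4] = pixels[2::4]  # red goes first
    out[2::4] = pixels[0::4]  # blue goes third
    if force_opaque:
        out[3::4] = b"\xff" * (len(pixels) // 4)
    return bytes(out)


def _chunk(tag, payload):
    """Build one PNG chunk: length, tag, payload, CRC of tag + payload."""
    crc = zlib.crc32(tag + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + tag + payload + struct.pack(">I", crc)


def encode_png(width, height, rgba):
    """Encode 8-bit RGBA pixel data as an unfiltered PNG and return the bytes."""
    stride = width * 4
    if len(rgba) != stride * height:
        raise ValueError("pixel data does not match the given dimensions")
    # Each scanline is prefixed with filter type 0 (no filtering).
    rows = b"".join(b"\x00" + rgba[y * stride:(y + 1) * stride]
                    for y in range(height))
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)  # 8-bit RGBA
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(rows)) + _chunk(b"IEND", b""))
```

For BG002_1 that would be roughly `open("bg.png", "wb").write(encode_png(640, 480, bgra_to_rgba(raw, force_opaque=True)))`, where `raw` is the decompressed pixel data.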

Chu
2007-09-12, 12:50
How exactly did you get them like that? I downloaded the program but I just get errors when I try to open them.

Er. Also, is there possibly an easier way to do that? ^^;; because I have like...six hundred of these buggers.

Minagi
2007-09-12, 14:33
Try this: http://www.sendspace.com/file/ci8513

It'll extract both LZ and raw files from .cel files.

Usage: cel.exe [flag] (files)

flags:
x - extract
d - decompress (only works on LZ files)

You can use a wildcard to open the files, so cel.exe x *.cel and then cel.exe d *.lz would work best for processing a lot of files. A small note though, things that use the LZ compression might come out a little funny. I used the standard LZ code (besides a small part I added in to skip the header) but I think they might've changed things up in their compression.

What game is this for anyway?

Chu
2007-09-12, 15:35
I'm not exactly sure how to use this program... When I give it a cel, it says 'could not open'.

The game is Akai Ito. Technically the event CGs have already been ripped, I'm mostly trying to get at the character images.

zalas
2007-09-12, 16:38
Hahah, amusingly enough the first time I saw AFS was in Akai Ito. Still haven't fully figured out how to re-encrypt the data files so that the trial version would run the full version files. If that were finished, then an Akai Ito translation that would run on PC wouldn't be out of the question <_<;

I guess if I have time I could dig around the PC executable to find the LZ decompression code. Though, I really should get more work done on my image code for Python so that I can write plugins for various game formats in it and have it be able to export/import TIFF files or something -_-;

Chu
2007-09-12, 17:07
zalas wrote: "Hahah, amusingly enough the first time I saw AFS was in Akai Ito. Still haven't fully figured out how to re-encrypt the data files so that the trial version would run the full version files. If that were finished, then an Akai Ito translation that would run on PC wouldn't be out of the question <_<;"

Ahaha, that was *exactly* what I was going to try and do! Since someone is already translating it (and finished two character routes) I thought that if I could figure out how to run it on the PC I could edit the script.

Minagi
2007-09-12, 18:47
Chu wrote: "I'm not exactly sure how to use this program... When I give it a cel, it says 'could not open'.

The game is Akai Ito. Technically the event CGs have already been ripped, I'm mostly trying to get at the character images."

Throw the exe in the same folder as the .cel files. That's how I always ran it.

zalas
2007-09-12, 19:12
http://img207.imageshack.us/img207/4762/testns5.th.png (http://img207.imageshack.us/my.php?image=testns5.png)
for the first image in C11B2M2.

That image is an indexed-color image. The first four bytes are 00 01 00 00. The next 1024 bytes are the palette: 256 BGRA tuples of 4 bytes each (1 byte per channel). The rest is the image data, 1 byte per pixel. The image I posted does not have the color palette applied and is just using a default grayscale mapping.

http://img405.imageshack.us/img405/5046/bg0021av7.th.png (http://img405.imageshack.us/my.php?image=bg0021av7.png)

BGRA ordering

Basically the LZ algorithm is as follows, as far as I know:

[L][Z][decompressed size:DWORD(BE)][compressed size:DWORD(BE)]

The LZSS implementation uses a 0x400(1024) byte window, with the starting offset at 0x3EE into the window (where the first lookback data will be written).

Flag bytes are used to denote lookback(bit=0) or verbatim(bit=1). Bits are read from LSB in the flag byte.

For verbatim, one byte is copied and placed into window

For lookback, two bytes are read. Let's say the two bytes are byte0 and byte1. The start of the read-back from the window is at offset ((byte1&0xf0) << 4) | byte0. The length of the read-back is (byte1 & 0xf)+3.
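Putting the description above into code gives roughly the following. It is a sketch, not the game's actual routine. Three things are assumptions: the window starts out zero-filled, the compressed size counts only the bytes after the 10-byte header, and the lookback offset is an absolute window position that wraps modulo the window size (the formula as written can produce values above 0x3FF).

```python
import struct

WINDOW_SIZE = 0x400
WINDOW_START = 0x3EE


def lz_decompress(blob):
    """Decompress an 'LZ' block following the description in this thread."""
    if blob[:2] != b"LZ":
        raise ValueError("missing LZ signature")
    out_size, in_size = struct.unpack_from(">II", blob, 2)  # big-endian sizes
    src = 10
    end = min(len(blob), 10 + in_size)
    window = bytearray(WINDOW_SIZE)  # assumption: zero-filled at start
    wpos = WINDOW_START
    out = bytearray()

    def emit(value):
        # Every output byte is also written into the sliding window.
        nonlocal wpos
        out.append(value)
        window[wpos] = value
        wpos = (wpos + 1) % WINDOW_SIZE

    while src < end and len(out) < out_size:
        flags = blob[src]
        src += 1
        for bit in range(8):  # flag bits are consumed from the LSB upward
            if src >= end or len(out) >= out_size:
                break
            if flags & (1 << bit):
                emit(blob[src])  # verbatim byte
                src += 1
            else:
                if src + 1 >= end:
                    break  # truncated lookback pair
                byte0, byte1 = blob[src], blob[src + 1]
                src += 2
                offset = ((byte1 & 0xF0) << 4) | byte0
                length = (byte1 & 0x0F) + 3
                for i in range(length):
                    if len(out) >= out_size:
                        break
                    emit(window[(offset + i) % WINDOW_SIZE])
    return bytes(out)
```

Reading one byte at a time from the window matters: a lookback is allowed to overlap the bytes it is currently writing, which is how short repeating runs get encoded.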

Hope this helps.

EDIT: So what's this about a translation already done for 2 routes? The ROFS CVM format is kind of annoying for Akaiito in particular because it uses a scrambled form. roxfan figured out the descrambling needed, but I still don't know exactly what fields correspond to what in the CVM and have been too busy to sit down and figure it out. Akaiito's code is massively painful to go through, since it uses lots of optimizations -_-;

Minagi
2007-09-12, 20:03
That helped a lot, zalas, thanks. :)

The images that require 8 bit have 1028 (around there) bytes of random data at the beginning. If that's skipped, it'll read the image like normal (as pictured here). Is this the same for you or can you get it to decompress without that?

http://img174.imageshack.us/img174/1346/file70640x4802vd3.png

I think part of the unknown 12 bytes in the IMAG header tells what the type is, like 0x05 meaning 8 bit and 0x07 meaning 32 bit. That's what I've noticed at least: all of the 8 bit images were 0x05 and all of the 32 bit images were 0x07.
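One way to sanity-check that guess is to compare the expected payload size for each type against the decompressed size. A small sketch, assuming only the two type codes seen so far exist and that 8-bit images carry the 1028 bytes mentioned above (a 4-byte header plus a 1024-byte palette) in front of the pixels:

```python
# Type codes observed so far in the IMAG header; other values may exist.
BITS_PER_PIXEL = {0x05: 8, 0x07: 32}

PALETTE_BYTES = 4 + 256 * 4  # the 1028 bytes in front of 8-bit pixel data


def expected_size(width, height, type_code):
    """Expected size in bytes of the uncompressed image payload."""
    bpp = BITS_PER_PIXEL[type_code]
    if bpp == 8:
        return PALETTE_BYTES + width * height
    return width * height * 4


print(expected_size(640, 480, 0x07))  # 1228800
print(expected_size(640, 480, 0x05))  # 308228
```

If the decompressed size in the LZ header matches one of these, that is decent evidence the type byte has been read correctly.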

Here's the new version of the tool I'm using: http://www.sendspace.com/file/xc1poy

zalas
2007-09-12, 21:00
Reread what I wrote in the previous post ;)
It's 4 byte header and 256 BGRA tuples for the palette.
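In code, that layout expands to 32-bit pixels like this. A minimal sketch: `expand_indexed` is an illustrative name, and the width and height are assumed to come from the IMAG header.

```python
HEADER_BYTES = 4
PALETTE_BYTES = 256 * 4  # 256 BGRA entries, 4 bytes each


def expand_indexed(data, width, height):
    """Expand an 8-bit paletted image into interleaved BGRA bytes."""
    pal_start = HEADER_BYTES
    pix_start = pal_start + PALETTE_BYTES
    palette = data[pal_start:pix_start]
    indices = data[pix_start:pix_start + width * height]
    if len(palette) != PALETTE_BYTES or len(indices) != width * height:
        raise ValueError("data too short for the given dimensions")
    # Each index byte selects one 4-byte BGRA palette entry.
    return b"".join(palette[i * 4:i * 4 + 4] for i in indices)
```

The result is in the same BGRA order as the 32-bit images, so it can go through whatever path those already use.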

Minagi
2007-09-12, 21:19
I didn't even notice that part. XD; That makes sense to me now. :)

Edit: Yay! Here we go.

http://img489.imageshack.us/img489/3172/file4uf1.png http://img472.imageshack.us/img472/7518/c11zz9.png

After like 2 hours of unsuccessfully trying to make a program that'll convert from raw to bmp (kept having some issues with the palette >.<), I've decided to just upload the version that outputs the colored raw files.
http://www.sendspace.com/file/8rpt0h

This outputs a (file)_color.raw which can be opened with IrfanView using the settings 32 BPP, RGB, Interleaved.

Chu
2007-09-13, 02:06
zalas wrote: "EDIT: So what's this about a translation already done for 2 routes? The ROFS CVM format is kind of annoying for Akaiito in particular because it uses a scrambled form. roxfan figured out the descrambling needed, but I still don't know exactly what fields correspond to what in the CVM and have been too busy to sit down and figure it out. Akaiito's code is massively painful to go through, since it uses lots of optimizations -_-;"

http://digitalmedia.arts.ufl.edu/~kmcgo/Akaiito/ ^_^

Asceai
2007-09-13, 03:38
Well, shit, I'd finally managed to figure out the backreference format for that LZSS stuff and someone else has already done it. Oh well =p.

quick, now to figure out reinsertion before you do~ (just kidding, I wouldn't be able to test reinsertion anyway =p)

zalas
2007-09-13, 07:12
Ah, thanks for the link, Chu. So I guess it's an old-skewl non-patch translation. Amusingly enough, I think you can run Akaiito with decent speed on a recent PC using pcsx2.

Chu
2007-09-13, 19:04
zalas wrote: "Ah, thanks for the link, Chu. So I guess it's an old-skewl non-patch translation. Amusingly enough, I think you can run Akaiito with decent speed on a recent PC using pcsx2."

Actually no ;.; Well. The game itself runs fine, but the sound quality is HORRIBLE. It's slow and splicey and ew. Then when you get to the opening video it crashes.

I'm waiting for pcsx4. Hopefully it'll run better then.

Wah X_x I wish there was a simpler way to extract these CGs...this is going to suck.

Asceai
2007-09-14, 02:36
Well, if you want a different extraction tool, you'll have to say exactly what you want. Do you want a program to output BMPs or PNGs or something instead? Or.. what, exactly? I should be able to whip up a C program to do anything like that in a few minutes.

Chu
2007-09-14, 03:15
I prefer PNGs, actually. Good quality and the file size isn't outrageous.

Thanks for all the help, by the way =D

Asceai
2007-09-15, 08:36
Okay, I've done it

get it here (http://www.asceai.net/files/cp10x.zip)

some things to note:
- I still don't know this file format, so there's probably heaps of other crazy chunks in there, but it only really looks at IMAG and ENDC anyway
- Since I've only got the three input files you posted, I haven't been able to test it very widely. If you find something it can't handle, throw it at me.
- I made a number of guesses, like the fact that the mystery bit between the pixel format and image/chunk size is used to say whether it's lzss or not. That may not actually be true, but this program works for those images you uploaded.
- This program produces a lot of output. Most of it will probably be useless, but you can at least see what it's doing if it screws up.

Asceai
2007-09-16, 17:20
Bump, so that the person who requested this actually sees it.

Chu
2007-09-16, 21:26
Thanks!

Oh my god O_O

So I think I got them all converted, and I know what the little dinky files are now XD

Apparently the eyes and mouth of every character in every pose are each a separate .cel file.

Imagine my shock when I went to look for my favorite character's .cel file and saw she had no eyes or mouth D:

OMG SHE'S SOULLESS.

Asceai
2007-09-17, 13:42
the ANIM chunks probably contain the information about how those things are all composited together, if anyone wanted to try and work it out. you could probably create animated gifs from them or something

Chu
2007-09-17, 14:45
I can't do it myself so I guess I'll wait around and see if anyone wants to take a stab at it XD In the meantime I'm having fun playing Mr. Potato Head with Yumei.

Asceai
2007-09-17, 19:30
It's probably pretty simple, but for now I'm a bit too busy. I might have a go later, though.

EDIT: It would do me a big favour if you zipped a few more .cels, preferably ones that seem to be character images, and sent them to me (either via forum upload, or rapidshare or anything like that) The problem is at the moment I only have that one animated .cel you uploaded (the first one).

Chu
2007-09-19, 13:20
Here's a couple of Yumei's cels.

http://www.sendspace.com/file/r1sqnv

Asceai too lazy to log in
2007-09-19, 17:13
Well, the anim chunks only seem to contain the frames to show for that animation and the delay between frames. As of yet I have not found anything useful for actually positioning the images for drawing the animation. Perhaps the ENTR chunks are worth a look (after all, the images that are made up of three parts seem to have three ENTR chunks).

Unregistered
2007-09-19, 17:17
Yes, it's definitely in the ENTR chunks. Fantastic..

Asceai too lazy to log in
2007-09-19, 19:39
Okay, try this:

http://www.asceai.net/files/cp10d.zip

It's not a conversion program (I couldn't really do one anyway for this kind of thing), it's a displayer. I probably got the speed totally out (for some reason either the eyes blink too slowly or the mouth moves too fast) but I think I understand the ENTR and ANIM formats fairly well now.

Chu
2007-09-19, 20:18
It works well enough for my purposes (getting the actual image). Thanks!

Rasqual Twilight
2007-12-27, 17:25
Just out of curiosity, did anyone manage to reinsert the full version files into the PC demo version?

casuaa
2008-01-21, 09:57
Was anyone able to extract the script and dump it back in?

I was interested in translating the Aoi Shiro demo by the same company and it seems they also use the .afs format.

The demo is the 3rd link (242 MB)
http://www.distribution.ne.jp/game/list/success/aoishiro/

zalas
2008-01-21, 13:52
Since the demo uses unpacked .AFS files, I wonder if this means the demo will be able to run the full version without a PS2...

casuaa
2008-01-22, 09:33
That'd be cool if it could do that.
I'll have to try once I get the game. :)

casuaa
2008-04-27, 09:02
This might not apply for Akaiito, but for the Aoishiro demo, I was able to translate some of it by extracting the .dat files from SCRIPT.AFS using AFSExtract7 1.3 and manually editing it with a hex editor... >_<

I couldn't figure out which character encoding was used, but it's using 2 bytes for Japanese characters, and one byte for some of the English alphabet.

Ex.
9F = a BF = A
86 = z A6 = Z
7FA4 = long dash
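The four letter values listed above are consistent with a simple descending mapping, since both a..z and A..Z span exactly 25 steps (0x9F down to 0x86, 0xBF down to 0xA6). The sketch below assumes the letters in between follow the same pattern, which only those four data points support, so treat it as a hypothesis to test against the script files.

```python
def encode_letter(ch):
    """Map an ASCII letter to its guessed single-byte script code."""
    if "a" <= ch <= "z":
        return 0x9F - (ord(ch) - ord("a"))
    if "A" <= ch <= "Z":
        return 0xBF - (ord(ch) - ord("A"))
    raise ValueError("no known mapping for %r" % ch)


def decode_byte(value):
    """Inverse of encode_letter for the two guessed ranges."""
    if 0x86 <= value <= 0x9F:
        return chr(ord("a") + 0x9F - value)
    if 0xA6 <= value <= 0xBF:
        return chr(ord("A") + 0xBF - value)
    raise ValueError("byte 0x%02X is outside the guessed letter ranges" % value)
```

A quick check is to decode a run of single-byte values from a line you have already translated by hand and see whether readable text comes out.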

The only problem is that I couldn't figure out how to increase the space allocated for characters.

Ex.

FC 00 indicates the next four bytes are for a name, but you can't just change it to FA 00 and add two more bytes to the file; it gives a script error...

Here's a screenshot:
http://usera.imagecave.com/casuaa/aoshiroscreenshot.jpg

First chapter translated:
http://www.sendspace.com/file/7lii9x

Rasqual Twilight
2008-09-30, 16:25
zalas wrote: "EDIT: So what's this about a translation already done for 2 routes? The ROFS CVM format is kind of annoying for Akaiito in particular because it uses a scrambled form. roxfan figured out the descrambling needed, but I still don't know exactly what fields correspond to what in the CVM and have been too busy to sit down and figure it out. Akaiito's code is massively painful to go through, since it uses lots of optimizations -_-;"

Am I wrong when I follow hints of a particular source saying the CVM file is merely an ISO-9660 file with a 0x1800-byte header, plus (in this case) scrambling of the filesystem descriptor (no Joliet AFAICT)?

I think I could locate the descrambling part, but is there a working implementation already?
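A quick way to test the "ISO-9660 behind a 0x1800-byte header" idea on a given file is to skip the header and look for the standard volume descriptor signature. In a plain ISO 9660 image the descriptors start at sector 16 (2048-byte sectors), and each one carries the identifier "CD001" right after its type byte. The sketch assumes the ISO data starts immediately after the header; for a scrambled CVM the check is expected to fail until the descrambling step is applied.

```python
CVM_HEADER_SIZE = 0x1800  # assumption taken from the question above
SECTOR_SIZE = 2048
PVD_OFFSET = 16 * SECTOR_SIZE  # ISO 9660 volume descriptors begin at sector 16


def iso_view(cvm_bytes):
    """Return the data following the assumed CVM header."""
    return cvm_bytes[CVM_HEADER_SIZE:]


def has_plain_iso_descriptor(cvm_bytes):
    """True if an unscrambled 'CD001' signature sits where ISO 9660 puts it."""
    iso = iso_view(cvm_bytes)
    return iso[PVD_OFFSET + 1:PVD_OFFSET + 6] == b"CD001"
```

If this returns True, writing `iso_view(...)` to a file should give something an ordinary ISO tool can open. If it returns False, the descriptor area is probably the scrambled part.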