Difference between revisions of "FF7/LGP format"

Latest revision as of 06:53, 27 August 2022

LGP Archive format for PC by Ficedula

This section explains how the LGP archives from FF7 PC are constructed. If you're looking for a tool that already manages LGP archives, try Ficedula's LGP Editor.

Essentially the LGP file is split up into four (maybe fewer, depending on how you count it) sections.

  1. File header/Table of contents
  2. Lookup table (formerly designated as "CRC code")
  3. Actual data
  4. File terminator

Section 1: File Header

This contains two parts: A header of fixed size, then the table of contents.

The first item is 12 bytes containing the file creator. This is a standard string, except it is right-aligned. In other words, the blank space comes before the actual text, not after. In FF7 it's always "SQUARESOFT" preceded by two nulls to make it 12 bytes. The only other thing you might see is the header "FICEDULA-LGP", which I use to indicate a file is an LGP *patch* one of my programs has constructed, not a complete archive.

Next is a four-byte integer saying how many files the archive contains.

Following this is the table of contents (TOC): One entry per file.

Each entry in the TOC has the following structure:

Size             Description
20 bytes         Null-terminated string giving the filename
4-byte integer   Position in this archive where the data for the file starts
1 byte           Some sort of check code, possibly file attributes. Normally seems to be 14, but it does vary.
2-byte short     Something to do with duplicate filenames. If a name is unique it is 0; otherwise it is assigned a value based on existing duplicates. (Hard to explain)
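
The header and TOC layout above can be read with a short parser. This is a minimal sketch in Python, not the code used by any existing tool. It assumes the integers are stored little-endian (the text above does not say, but that is what you would expect on PC), and the names `parse_header_and_toc` and `TocEntry` are made up for illustration.

```python
import struct
from typing import List, NamedTuple, Tuple

# Little-endian byte order is an assumption; the layout itself is from the text.
HEADER = struct.Struct("<12sI")       # creator string, number of files
TOC_ENTRY = struct.Struct("<20sIBH")  # filename, data offset, check byte, duplicate value


class TocEntry(NamedTuple):
    name: str
    offset: int
    check: int
    duplicate: int


def parse_header_and_toc(data: bytes) -> Tuple[str, List[TocEntry]]:
    """Return the creator string and the TOC entries of an LGP archive."""
    creator_raw, count = HEADER.unpack_from(data, 0)
    # The creator is right-aligned: the padding nulls come before the text.
    creator = creator_raw.lstrip(b"\x00").decode("ascii")
    entries = []
    pos = HEADER.size
    for _ in range(count):
        raw_name, offset, check, duplicate = TOC_ENTRY.unpack_from(data, pos)
        # The filename is null-terminated inside its 20-byte field.
        name = raw_name.split(b"\x00", 1)[0].decode("ascii")
        entries.append(TocEntry(name, offset, check, duplicate))
        pos += TOC_ENTRY.size
    return creator, entries
```

Each TOC entry is 27 bytes (20 + 4 + 1 + 2), so under these assumptions the TOC ends at byte 16 + 27 x (number of files).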

Section 2: Section formerly designated as "CRC Code"

This section is 3600 bytes. It is 30 sets of 30 entries containing two 16-bit words each (30 x 30 x 2 x 2 = 3600)

The sets contain file-group information which is based on the first two letters of each file name.

The first letter, minus the value for ascii 'a' (0x61) is the index of the set to which the file belongs.

The second letter, minus the value for ASCII '`' (0x60), is the index of the entry within the set. Since the second letter in all of the file names is ASCII 'a' or greater, the lowest entry index is 1, so the first entry (at index 0) in every set is always zero (0x0000).

Each entry is two words.

The first word is the 1-based index of the directory entry for the first file in the set.

The second word is the number of files in the set, most of which are 0x003c (60). There are a few entries after the bulk which have fewer entries.

The meaning of these sets and why they're divided in this manner is yet to be determined.

There is one 16-bit word with the value of 0 (0x0000) at the end of this data which may belong to this section or the next.
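
To make the indexing concrete, here is a hedged sketch in Python of how a filename maps into this table. It follows only the rules described above, so it covers names whose first two characters are lowercase letters and nothing else. The function names are hypothetical, and little-endian words are an assumption.

```python
import struct
from typing import List, Tuple

SETS = 30
ENTRIES_PER_SET = 30
TABLE_SIZE = SETS * ENTRIES_PER_SET * 2 * 2  # 30 x 30 x 2 words x 2 bytes = 3600


def table_indices(filename: str) -> Tuple[int, int]:
    """Return (set index, entry index) from the first two letters of a name."""
    set_index = ord(filename[0]) - 0x61    # 'a' maps to set 0
    entry_index = ord(filename[1]) - 0x60  # 'a' maps to entry 1
    return set_index, entry_index


def parse_table(table: bytes) -> List[List[Tuple[int, int]]]:
    """Split the 3600-byte table into 30 sets of 30 (first index, count) pairs."""
    if len(table) != TABLE_SIZE:
        raise ValueError("expected exactly 3600 bytes")
    # Each entry is two 16-bit words (little-endian assumed).
    pairs = list(struct.iter_unpack("<HH", table))
    return [pairs[i * ENTRIES_PER_SET:(i + 1) * ENTRIES_PER_SET] for i in range(SETS)]
```

For a name starting with "cl", `table_indices` gives set 2 and entry 12, so the matching pair is `parse_table(table)[2][12]`.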

Section 3: Actual Data

The data from the files. However, it's not that simple: the TOC doesn't list how long each file is (which would be somewhat useful). The length is recorded here instead. The offset in the TOC is actually the position of yet another file header. The format is:

Size       Description
20 bytes   Null-terminated string giving the filename
4 bytes    File length
Varies     The file data itself
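
Given an offset taken from the TOC, extracting a file comes down to reading this small header and then the data that follows it. A minimal sketch in Python, assuming a little-endian length field; `read_file_at` is an illustrative name, not part of any existing tool.

```python
import struct
from typing import Tuple

FILE_HEADER = struct.Struct("<20sI")  # filename, file length (little-endian assumed)


def read_file_at(data: bytes, offset: int) -> Tuple[str, bytes]:
    """Read the per-file header at a TOC offset and return (name, contents)."""
    raw_name, length = FILE_HEADER.unpack_from(data, offset)
    name = raw_name.split(b"\x00", 1)[0].decode("ascii")
    start = offset + FILE_HEADER.size
    payload = data[start:start + length]
    if len(payload) != length:
        raise ValueError("file data runs past the end of the archive")
    return name, payload
```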

Section 4: Terminator

After the last piece of data comes the file descriptor. This is a simple string, except instead of being null-terminated it's terminated by the end of the file. It's "FINAL FANTASY 7" for all archives, except LGP patches, where it's "LGP PATCH FILE".
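
Because the terminator is simply whatever sits at the very end of the file, telling a full archive from a patch can be sketched as a suffix check. The function name `archive_kind` and the returned labels are assumptions made for this example.

```python
# The two terminator strings are the ones given in the text above.
TERMINATORS = {
    b"FINAL FANTASY 7": "archive",
    b"LGP PATCH FILE": "patch",
}


def archive_kind(data: bytes) -> str:
    """Classify an LGP file by the string at its very end."""
    for marker, kind in TERMINATORS.items():
        if data.endswith(marker):
            return kind
    return "unknown"
```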

Notes

The game is remarkably flexible about LGP archives. So long as the TOC and the CRC data (Section 2) are intact, it'll accept just about anything.

  • Example 1: The filename in the TOC and in the actual file header don't have to match. It only checks the TOC.
  • Example 2: You can point two entries in the TOC at the same data and it works.
  • Example 3: You can have ANY junk in the data section so long as all the TOC entries point to a valid file header. Not every piece of data has to be "accounted" for by the TOC. There can be data not used.

LGP Editor uses this to its advantage in the Advanced Editor. If you want to replace a file in an LGP archive with your own copy, it just puts the file on the end of the LGP, writes a new file terminator, and updates the TOC to point at the new file. It even lets you link two TOC entries to the same data or have "inactive" files in the archive that aren't referenced by any TOC entry.
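
The append-and-repoint approach described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the LGP Editor's actual code. It assumes little-endian integers, a 16-byte fixed header followed by 27-byte TOC entries, and a filename short enough to keep its null terminator inside the 20-byte field. `replace_by_append` is a hypothetical name.

```python
import struct

HEADER_SIZE = 16      # 12-byte creator + 4-byte file count
TOC_ENTRY_SIZE = 27   # 20-byte name + 4-byte offset + 1 byte + 2-byte short
TERMINATOR = b"FINAL FANTASY 7"


def replace_by_append(archive: bytes, name: str, new_data: bytes) -> bytes:
    """Return a copy of the archive with `name` redirected to appended data."""
    out = bytearray(archive)
    # Drop the old terminator so the new file can go at the end.
    if out.endswith(TERMINATOR):
        del out[-len(TERMINATOR):]
    (count,) = struct.unpack_from("<I", out, 12)
    wanted = name.encode("ascii")
    for i in range(count):
        entry = HEADER_SIZE + i * TOC_ENTRY_SIZE
        entry_name = bytes(out[entry:entry + 20]).split(b"\x00", 1)[0]
        if entry_name == wanted:
            new_offset = len(out)
            # Append a fresh file header followed by the replacement data.
            out += struct.pack("<20sI", wanted, len(new_data)) + new_data
            out += TERMINATOR
            # Point the TOC entry at the new copy; the old data stays behind unused.
            struct.pack_into("<I", out, entry + 20, new_offset)
            return bytes(out)
    raise KeyError(name)
```

The old copy of the file is left in place, which is exactly the kind of unreferenced data the game tolerates according to Example 3.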

I don't know whether the file terminator has to be intact, but for safety's sake my editor preserves it. The CRC (Section 2) must be present and correct. Also, if you're replacing an archive with your own custom version, make sure it has filenames in the TOC matching the ones in the old one.

The game doesn't check archive sizes as long as all filenames are present. So if you want, you could replace an archive containing 95 files with a 98-file archive, so long as 95 of those 98 names matched those present in the original 95-file archive. (However there's no point in doing this when the game won't use any files other than the 95 it's expecting to find).
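
A quick way to check this rule before swapping archives is to compare the two sets of TOC filenames. A small sketch, assuming the names have already been read from both TOCs and are compared exactly as stored (whether the game compares case-sensitively is not covered here); `missing_names` is a hypothetical helper.

```python
from typing import Iterable, Set


def missing_names(original: Iterable[str], replacement: Iterable[str]) -> Set[str]:
    """Return the original filenames that the replacement archive lacks."""
    return set(original) - set(replacement)
```

An empty result means every filename the game expects is present, regardless of how many extra files the replacement contains.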

There are reports on Qhimm's board that once you've altered an archive and the game refuses to read it, it won't ever read it until you reinstall - even if you fix the problem/restore from a backup. The idea was generally scorned and ignored, but I'll mention it because something like that happened to me. No solid conclusion can be drawn here.

Sometimes, there are data "gaps" in the file that don't appear to be referenced by any file - even by an inactive file. If you're only using the TOC method to get at files (the easy way) then you won't notice this anyway. However, if you're stepping through the file header by header, even reading the unused ones, this can cause problems. If you use my program to update a file with one that's smaller than the original (can happen) then it writes it in, but leaves a gap after it (of course). However, to help you out, after the end of the file, it writes a 4 byte integer saying how much more space to skip over to reach the next file header. This really doesn't affect many things - only tools (like my Advanced LGP Editor) that bypass the TOC to construct their own file lists. FF7 never notices a thing.

Useful downloads

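Below are links to known programs capable of editing LGP archives:

  • LGP Tools (http://www.sylphds.net/f2k3/programs/lgptools/lgptools160.zip) - includes an Advanced LGP Editor that allows thorough editing of archives
  • Emerald (http://elentor.com/Projetos/FF7-Tools/Extracting/Emerald.zip) - has a mass extracting/repacking function
  • Unmass (http://mirex.mypage.sk/index.php?selected=1#Unmass) - general file extractor with LGP archive support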