JJ
2022-03-09 06:04:18 UTC
Accessing a large file has been really slow when the needed data is at or
near the end of the file, because the data has to be read from the start
up to the needed file offset.
But I've finally found it: a seekable file stream ActiveX object built into
Windows itself. No third-party software required. The object was found in an
unexpected place: the Speech API, as the automation object named
`SAPI.SpFileStream`.
https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms722561(v=vs.85)
The object is meant for handling WAV audio files, but it also supports raw
(formatless) data.
With it, we can process large files faster, e.g. in-place patching such as
fixing the frame rate or audio sampling rate of video files, or extracting
part of a file from the middle or the end.
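The access pattern this enables is: seek straight to an offset, then read or
overwrite a few bytes, without touching the rest of the file. A minimal
sketch of that pattern in Python, using an ordinary file object as a stand-in
(it does not call SpFileStream, and the helper names `read_tail` and
`patch_at` are made up for illustration):

```python
import os
import tempfile

def read_tail(path, count):
    """Return the last `count` bytes without reading what precedes them."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)   # seek() returns the new absolute offset
        f.seek(max(size - count, 0))
        return f.read(count)

def patch_at(path, offset, data):
    """Overwrite len(data) bytes at `offset` in place.

    The file size is unchanged as long as offset + len(data) stays within
    the existing file.
    """
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)

# Demo on a small throwaway file: 6 + 100 + 7 = 113 bytes.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"HEADER" + b"\x00" * 100 + b"TRAILER")

patch_at(demo_path, 0, b"header")        # patch the first 6 bytes
tail = read_tail(demo_path, 7)           # fetch only the last 7 bytes
size_after = os.path.getsize(demo_path)
with open(demo_path, "rb") as f:
    head = f.read(6)
os.remove(demo_path)
```

With a seekable stream, the cost of either operation depends on the number of
bytes touched, not on how far into the file they sit.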
A huge bonus is that it supports files larger than 4GB as long as the file
system supports it (tested with a 6GB file; read and write). Its Read()
method gives Byte(), i.e. a literal array of bytes, not a variant array. It's
like ADODB.Stream's Read().
And best of all, the object has existed since at least Windows XP.
But it's not perfect. It has several limitations:
- There's no way to truncate a file, i.e. delete the data from the current
file offset onward and shrink the file size. That would be the equivalent of
the SetEndOfFile() Windows API. This functionality is still absent from
VBScript without third-party software.
- Seek() cannot be used to increase the file size. Data must be written to
increase it.
- There's no way to create a file, or overwrite an existing one, in raw
format. While there's the SSFMCreateForWrite (3) file mode to create and
overwrite an existing file, it forces the format to WAV type 22, and the WAV
header is written automatically. The only exception is Windows XP, where the
SSFMCreateForWrite file mode still only supports raw format.
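Since seeking past the end does not grow the file, growing it means appending
filler bytes. A rough sketch of that workaround in Python, again with an
ordinary file object standing in for the stream (`grow_to` and the chunk size
are illustrative choices, not part of SAPI):

```python
import os
import tempfile

def grow_to(f, target_size, chunk_size=64 * 1024):
    """Append zero bytes to the open binary file `f` until it reaches `target_size`.

    Writes in chunks so a large gap never needs one huge buffer.
    Returns the number of bytes appended (0 if the file is already big enough).
    """
    current = f.seek(0, os.SEEK_END)
    remaining = max(target_size - current, 0)
    appended = 0
    while remaining > 0:
        n = min(chunk_size, remaining)
        f.write(b"\x00" * n)
        appended += n
        remaining -= n
    return appended

# Demo: a 3-byte file grown to 200,000 bytes.
with tempfile.TemporaryFile() as demo:
    demo.write(b"abc")
    added = grow_to(demo, 200_000)
    final_size = demo.seek(0, os.SEEK_END)
```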
And one caveat:
- Passing a string to Write() always writes that string in UTF-16 encoding,
_plus_ one null character.
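To make the caveat concrete, this Python sketch computes the bytes one would
expect to land in the file for a given string. It assumes little-endian
UTF-16 without a byte order mark, and that the trailing null is one 16-bit
code unit (two zero bytes); neither detail is spelled out above, and
`bytes_written_for_string` is a hypothetical helper, not a SAPI call:

```python
def bytes_written_for_string(text):
    # Assumption: UTF-16 little-endian, no BOM, plus one 16-bit null terminator.
    return text.encode("utf-16-le") + b"\x00\x00"

expected = bytes_written_for_string("AB")
```

If those assumptions hold, a two-character ASCII string takes six bytes on
disk rather than two.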
Additional information not found in the documentation:
- On object creation, the default format type is SAFTNoAssignedFormat (0),
instead of SAFTDefault (-1).
- Format GUIDs:
  - SAFTNoAssignedFormat (0): {00000000-0000-0000-0000-000000000000}
  - SAFT22kHz16BitMono (22): {C31ADBAE-527F-4FF5-A230-F62BB61FF70C}