Discussion:
Access the 3rd and rest elements of a Byte() (byte array)?
JJ
2022-03-07 23:24:27 UTC
How to access the 3rd and rest elements of a Byte() (byte array)? Note: the
byte array is an array of bytes. It's not a variant array of bytes.

I can only access the first and second elements, but I can't figure out how
to access the 3rd and the rest of the elements.

Below is an example whose input byte array is simulated. The actual byte array
comes from an ActiveX control, where I can't control the element index or how
many elements to retrieve.

'simulate byte array creation
set ds = createobject("adodb.stream")
ds.open
ds.charset = "windows-1252"
ds.writetext "abcdefghijklmnopqrstuvwxyz"
ds.position = 0
ds.type = 1

arr = ds.read 'simulated byte array retrieval

'byte array access...
msgbox asc(a) 'first element as byte
v = ascw(a) 'first & second element as int (16-bit integer)
msgbox v \ 256 'second byte of above int (16-bit integer)
'how to access 3rd and rest of elements?
R.Wieser
2022-03-08 10:22:57 UTC
JJ,
Post by JJ
msgbox asc(a) 'first element as byte
Whut ? Where did you create/load the data into that "a" variable ? What's
actually in it ?

Also, why do you think that "asc()" has anything to do with an array ?
Post by JJ
v = ascw(a) 'first & second element as int (16-bit integer)
No.

It just tries to convert the *wide-string character* in the variable "a"
into a value. If that "a" variable would contain a string (and not a
single character) then it just takes the first character (*not* byte).

In your case that is moot though, as you have set your stream to binary. As
far as I can tell that means that *no multi-byte-character conversion* is
done, and each read byte is placed in its own array element - even if it
could be part of a multi-byte character.

I'm probably misunderstanding your problem, but here goes :

To access each character (byte?) in a string you need to use a mid(arr,i,1)

Regards,
Rudy Wieser
Mayayana
2022-03-08 13:07:56 UTC
"R.Wieser" <***@not.available> wrote

| To access each character (byte?) in a string you need to use a mid(arr,i,1)
|

I don't think it's possible with straight VBS because it's
looking for a variant. You can take the byte data and
walk the position, reading 1 byte at a time. With that it's
possible to get a, b, c, d, etc. But that only works as long
as you don't actually look at it programmatically. If you
read 1 byte at position 4 into x, you can do MsgBox x and
get "e". But you can't touch "e". It's datatype byte array.
You can't convert it. You can't get to it as x(0). Because
it's not a variant.

JJ theoretically created a byte string of ANSI characters,
but as Stream binary that's a byte array. I've written
AxEXEs to handle that kind of thing, but I write the
functions to take and return only variants.

FSO can do some operations by working with ANSI
string data, if you're careful not to "look null characters
in the eye". I've written a class that way that allows me
to work with binary files. But FSO still can't take in non-
variant data.

Maybe ADODB can process it. I don't know. I've never really
worked with that. WIA can access binary data, but it's
somewhat limited. And again, you have to just tell WIA
what you want it to do with that data. You can't use VBS
to access it directly.

Long story short, VBS is the wrong tool.
R.Wieser
2022-03-08 20:15:42 UTC
Mayayana,
Post by Mayayana
I don't think it's possible with straight VBS because
it's looking for a variant.
Yes, and no*. But I am afraid I was a bit too quick with my reply, making
assumptions I should not have. "mid(...)" doesn't work. JJ, my
apologies.

* JJ mentioned that both "asc()" and "ascw()" work. I just found that
"wscript.echo" also works - though it will try to interpret each two bytes
as a single wide-character symbol. In my case got 13 question-marks.
Alternating the written chars with a bunch of "chr(0)"-s causes the "abc"...
string to be displayed.
Post by Mayayana
It's datatype byte array. You can't convert it. You can't get to it
as x(0). Because it's not a variant.
I agree that that's pretty much it.

But you know, it's a bit funny : I've written a few OCX objects with methods
which do not accept variants at all (just DWORD integers / floats and
BStrings), and I can use them in VBS without a problem. IOW, VBS does a lot
of conversion below the surface. Just not here.
Post by Mayayana
Long story short, VBS is the wrong tool.
:-) Straight-up VBS isn't really usable for much of anything. The very moment
you want to actually /do/ something with it you need to load external
objects. Most often starting with the filesystem one.

Regards,
Rudy Wieser
Mayayana
2022-03-08 21:57:56 UTC
"R.Wieser" <***@not.available> wrote

| * JJ mentioned that both "asc()" and "ascw()" work. I just found that
| "wscript.echo" also works - though it will try to interpret each two bytes
| as a single wide-character symbol. In my case got 13 question-marks.
| Alternating the written chars with a bunch of "chr(0)"-s causes the "abc"...
| string to be displayed.
|

That seems odd. He specced it as ANSI English before declaring
it binary. So each byte should be a character. This works fine for me:

Set ds = createobject("adodb.stream")
ds.open
ds.charset = "windows-1252"
ds.writetext "abcdef"
ds.position = 0
ds.type = 1

For i = 0 to ds.size - 1
ds.position = i
x = ds.read(1)
MsgBox x
Next

ds.close
Set ds = Nothing

Oddly, if I don't set position to 0 first it says the operation
is not allowed. But as written I get a msgbox for a,b,c,d,e,f.
The trouble is that x is datatype byte() and VBS chokes on it.
JJ
2022-03-09 03:58:23 UTC
Post by R.Wieser
Yes, and no*. But I am afraid I was a bit too quick with my reply, making
assumptions I should not have. "mid(...)" doesn't work. JJ, my
apologies.
Hold on there, I didn't know that VB can convert Byte() to a string. In this
case, MID() can actually be used.

e.g. if the Byte() array contains:

F1, F2, F3, F4, F5, F6

hex(ascw(mid(arr, 1, 1))) would be F2F1.
hex(ascw(mid(arr, 2, 1))) would be F4F3.
hex(ascw(mid(arr, 3, 1))) would be F6F5.

But since MID() only accepts a string, the Byte() value must be converted to
a string first, so for performance's sake, if this method is used
(especially in a loop), MID() should use a string which was preconverted.

Since I notice that VBScript can't (won't?) convert the last byte of an odd
length Byte() into a UTF-16 character (FYI, JScript can), a different method
must be used.

e.g. if the Byte() array contents is:

F1, F2, F3, F4, F5

len(arr) would only be 2. So, we can't use MID(arr, 3, 1).
Instead, we use RIGHT(), which, unexpectedly, works. i.e.:

hex(ascw(right(arr, 1))) would be F5F4.

I was expecting it to be F4F3 because I assumed that it works just like
MID(), but I was wrong. It actually retrieves the last two bytes from the end
of the array, instead of from the last even index of the array.

So, thank you for mentioning MID().
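
A minimal sketch of the word-wise access described above, assuming (as observed in this thread, not from any documentation) that Mid() hands back two bytes per character. WordAt is a hypothetical helper name, and the test data is a byte string built with ChrB() as a stand-in for the real Byte(), so the sketch runs without the ActiveX source. AscW() returns a signed 16-bit Integer, so words of &H8000 and above come back negative and a plain "\ 256" would give the wrong high byte; the helper shifts them back into the 0..65535 range first.

```vbscript
Option Explicit

' Hypothetical helper: returns the 16-bit word at 1-based position idx
' as a non-negative Long. AscW() yields a signed Integer, so values of
' &H8000 and above arrive negative and are shifted up by 65536.
Function WordAt(data, idx)
  Dim w
  w = AscW(Mid(data, idx, 1))
  If w < 0 Then w = w + 65536
  WordAt = w
End Function

Dim data, i, w
' Stand-in for the Byte() (assumption): a six-byte string F1..F6.
data = ChrB(&HF1) & ChrB(&HF2) & ChrB(&HF3) & ChrB(&HF4) & ChrB(&HF5) & ChrB(&HF6)

For i = 1 To Len(data)
  w = WordAt(data, i)
  ' The low byte is the first byte of the pair, the high byte the second.
  WScript.Echo "word " & i & ": low=" & Hex(w And 255) & " high=" & Hex(w \ 256)
Next
```

This still only reaches whole byte pairs, so the last byte of an odd-length array needs the RIGHT() workaround or a byte-wise function.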
R.Wieser
2022-03-09 09:39:06 UTC
JJ,
Post by JJ
Hold on there, I didn't know that VB can convert Byte() to a string.
As far as I can tell, it still doesn't. But for some reason wscript.echo
accepts a Byte-array too.
Post by JJ
But since MID() only accept a string,
I'm not sure I understood you right, as you posted three lines you said
work, but then follow it with the above ...

Yes, a quick test shows that MID() indeed does work, but - unexpectedly -
grabs two bytes at a time.
Post by JJ
Instead, we use RIGHT(), which unexpectedly, works.
:-) Currently the only unexpected thing with MID() and RIGHT() is that, and
/how/ they work on a Byte() array. Of course, it didn't help that you need
an ASCW() (instead of just an ASC() ) to see the actual bytes.
Post by JJ
So, thank you for mentioning MID().
You're welcome. I'm just glad that my fumbling led to something useful.

But forgive me if I say that I probably won't ever try to use that
"adodb.stream" object in VBScript for anything. The above juggling brings
back memories of how to do string stuff in DOS Batch files. :-|

Regards,
Rudy Wieser
Ulrich Möller
2022-03-11 10:23:11 UTC
Post by Mayayana
| To access each character (byte?) in a string you need to use a mid(arr,i,1)
|
I don't think it's possible with straight VBS because it's
looking for a variant. You can take the byte data and
walk the position, reading 1 byte at a time. With that it's
possible to get a, b, c, d, etc. But that only works as long
as you don't actually look at it programmatically. If you
read 1 byte at position 4 into x, you can do MsgBox x and
get "e". But you can't touch "e". It's datatype byte array.
You can't convert it. You can't get to it as x(0). Because
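it's not a variant.

option explicit
dim ds, arr, i, n, h

stop 'debugger breakpoint; does nothing unless the script runs under a debugger
'simulate the byte array as in the original post
set ds = createobject("adodb.stream")
ds.open
ds.charset = "windows-1252"
ds.writetext "AaBbCc"
ds.position = 0
ds.type = 1 'binary
arr = ds.read

'midb() is 1-based and counts bytes; ascb() returns that byte's value
for i = 0 to ubound(arr)
  n = ascb(midb(arr,i+1,1))
  h = cint(n)
  wscript.echo h
next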

With these lines I have no problem to access the array byte by byte. n
is of type byte and can be converted arbitrarily. The trick is to use
MidB().

Ulrich
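
The same MidB()/AscB() loop can be wrapped into a reusable function. This is only a sketch: BytesToHex is a hypothetical name, and it assumes that LenB() accepts the Byte() in the same way MidB() does above. It formats every byte as two hex digits, which makes it easy to check what an ActiveX control actually returned.

```vbscript
Option Explicit

' Hypothetical helper: formats a Byte() (or byte string) as space-separated
' two-digit hex values. Assumes LenB()/MidB()/AscB() accept the Byte().
Function BytesToHex(data)
  Dim i, b, parts()
  If LenB(data) = 0 Then
    BytesToHex = ""
    Exit Function
  End If
  ReDim parts(LenB(data) - 1)
  For i = 1 To LenB(data)
    b = AscB(MidB(data, i, 1))           ' one byte, 0..255
    parts(i - 1) = Right("0" & Hex(b), 2) ' pad to two hex digits
  Next
  BytesToHex = Join(parts, " ")
End Function

Dim ds, arr
' Simulated byte array, built the same way as in the original post.
Set ds = CreateObject("ADODB.Stream")
ds.Open
ds.Charset = "windows-1252"
ds.WriteText "AaBbCc"
ds.Position = 0
ds.Type = 1   ' binary
arr = ds.Read
ds.Close

WScript.Echo BytesToHex(arr)
```

Collecting the pieces in an array and calling Join() once avoids rebuilding a growing string on every iteration, which matters for larger buffers.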
R.Wieser
2022-03-11 12:14:06 UTC
Ulrich,
Post by Mayayana
| To access each character (byte?) in a string you need to use a mid(arr,i,1)
The trick is to use MidB().
I can't remember having ever seen "midb()" used (forgotten all about it),
and never knew that "ascb()" existed. Thanks for the heads-up and example.
Stored for future use. :-)

Regards,
Rudy Wieser
Mayayana
2022-03-11 14:12:06 UTC
"Ulrich Möller" <***@arcor.de> wrote

| set ds = createobject("adodb.stream")
| ds.open
| ds.charset = "windows-1252"
| ds.writetext "AaBbCc"
| ds.position = 0
| ds.type = 1
| arr = ds.read
|
| for i = 0 to ubound(arr)
|   n = ascb(midb(arr,i+1,1))
|   h = cint(n)
|   wscript.echo h
| next
|
| With these lines I have no problem to access the array byte by byte.

Very nice. Though still mysterious. The charset
should have defined the string as ANSI. When that
was converted to binary it should have been an
array of bytes corresponding to each character.
Instead, by referring to array elements
in the way that I was, using size, I was apparently
getting an array of two elements for each character,
representing a unicode character. So there's transparent
unicode conversion. Yuck. This seems to be similar to VB.
You can handle various text formats, and presumably
if you wrote that string to disk you'd get an ANSI file.
But behind the scenes it's dealing in unicode-16.
Ulrich Möller
2022-03-11 17:41:11 UTC
Hi,
Post by Mayayana
| set ds = createobject("adodb.stream")
| ds.open
| ds.charset = "windows-1252"
| ds.writetext "AaBbCc"
| ds.position = 0
| ds.type = 1
| arr = ds.read
|
| for i = 0 to ubound(arr)
|   n = ascb(midb(arr,i+1,1))
|   h = cint(n)
|   wscript.echo h
| next
|
| With these lines I have no problem to access the array byte by byte.
Very nice. Though still mysterious. The charset
should have defined the string as ANSI.
Please notice: "Windows-1252" corresponds to ANSI.
Post by Mayayana
When that was converted to binary it should have been an
array of bytes corresponding to each character.
Instead, by referring to array elements
in the way that I was, using size, I was apparently
getting an array of two elements for each character,
representing a unicode character. So there's transparent
unicode conversion. Yuck.
No. AdoStream.Read returns a variant byte array when the stream is
switched to binary mode. You can even specify the number of bytes. There
is no transparent UNICODE conversion behind the scenes.
Post by Mayayana
This seems to be similar to VB.
You can handle various text formats, and presumably
if you wrote that string to disk you'd get an ANSI file.
But behind the scenes it's dealing in unicode-16.
Sorry, but this has nothing at all to do with Unicode-16.

Here is another little code snippet on how to create a byte array from a
string:
Dim oEncoding, aBytes

Set oEncoding = CreateObject("System.Text.ASCIIEncoding")
aBytes = oEncoding.GetBytes_4("abcdef")
wscript.echo ascb(midb(aBytes,3,1))

Greetings
Ulrich
Mayayana
2022-03-11 20:58:41 UTC
"Ulrich Möller" <***@arcor.de> wrote

| > Very nice. Though still mysterious. The charset
| > should have defined the string as ANSI.

| Please notice: "Windows-1252" corresponds to ANSI.

Yes, that's what I'm saying.

| No. AdoStream.Read returns a variant byte array when the stream is
| switched to binary mode. You can even specify the number of bytes. There
| is no transparent UNICODE conversion behind the scenes.

It seems to be ambiguous. If it were a byte array
I should be able to return the value of arr(0), but
that doesn't work. AscB and MidB are returning bytes
from a string. So it's a unicode string that you're able to
partially access as an array.

The ubound is 5. But your code is actually doing "for 1 to 6,
give me the first byte of the unicode string character at
that offset". Yes, it's specced as ANSI. But ADODB is
converting it internally. Very gross stuff. I knew there was
a reason I never used ADODB. :)

Try doing it without the B. You'll get question marks.
ASCII 63. I don't see how you knew to use B if you
thought it was ANSI or if you thought it was actually
an array. MidB wouldn't be relevant in either case.
JJ
2022-03-11 23:24:49 UTC
The trick is to use MidB().
Dang it, Microcrap! They didn't mention there's a byte version of MID() in
the VBScript documentation. MIDB() solves everything.
Mayayana
2022-03-12 00:30:12 UTC
"JJ" <***@gmail.com> wrote

| > The trick is to use MidB().
|
| Dang it, Microcrap! They didn't mention there's a byte version of MID() in
| the VBScript documentation. MIDB() solves everything.

AscB, AscW, MidB, LenB... they're all in the wscript
help file. But they're not the kind of thing that one
often has occasion to use.
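
A short sketch of how the byte versions differ from the character versions. To keep it independent of ADODB it uses an ordinary VBScript string, which is stored as UTF-16, so it says nothing about what a Byte() contains.

```vbscript
Option Explicit

Dim s, i
s = "abc"   ' an ordinary VBScript string, two bytes per character

WScript.Echo "Len  = " & Len(s)    ' counts characters
WScript.Echo "LenB = " & LenB(s)   ' counts bytes

' Walk the raw bytes: MidB() is 1-based and byte-oriented,
' AscB() returns the value of the first byte of its argument.
For i = 1 To LenB(s)
  WScript.Echo i & ": " & AscB(MidB(s, i, 1))
Next
```

For "abc" this reports 3 characters and 6 bytes, and the loop echoes 97, 0, 98, 0, 99, 0: each ASCII letter followed by a zero high byte.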
