Topic: Split or Explode

DorkeyDear (I also go by Curt or menturi)
Split or Explode
« on: June 17, 2008, 11:34:50 am »
Code:
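// Milliseconds since the start of the current hour (minutes, seconds and
// milliseconds only), so a measurement that crosses an hour boundary is wrong.
function Time: cardinal;
begin
  Result := 60000 * StrtoInt(FormatDate('nn')) + 1000 * StrtoInt(FormatDate('ss')) + StrtoInt(FormatDate('zzz'));
end;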

function Split(const Source: string; const Delimiter: string): tstringarray;
var
  i,x,d: integer;
  s: string;
begin
  d := Length(Delimiter);
  x := 0;
  i := 1;
  SetArrayLength(Result,1);
  while i <= Length(source) do begin
    s := Copy(Source,i,d);
    if s = Delimiter then begin
      Inc(i,d);
      Inc(x,1);
      SetArrayLength(result,x + 1);
    end else begin
      Result[x] := Result[x] + Copy(s,1,1);
      Inc(i,1);
    end;
  end;
end;

function SplitB(const Source: string; const Delimiter: string): array of string;
var
  i,x,d: integer;
  s: string;
begin
  d := Length(Delimiter);
  x := 0;
  i := 1;
  SetArrayLength(Result,1);
  while i <= Length(source) do begin
    s := Copy(Source,i,d);
    if s = Delimiter then begin
      Inc(i,d);
      Inc(x,1);
      SetArrayLength(result,x + 1);
    end else begin
      Result[x] := Result[x] + Copy(s,1,1);
      Inc(i,1);
    end;
  end;
end;

function Explode(Source: string; const Delimiter: string): array of string;
var
  TempStr: string;
begin
  Source := Source + Delimiter;
  repeat
    TempStr := GetPiece(Source, Delimiter, 0);
    SetArrayLength(Result, GetArrayLength(Result) + 1);
    Result[GetArrayLength(Result) - 1] := TempStr;
    Delete(Source, 1, Length(TempStr) + Length(Delimiter));
  until Length(Source) = 0;
end;

procedure ActivateServer();
var
  i: integer;
  TempA, TempB, TempC: cardinal;
begin
  TempA := Time;
  for i := 1 to 10000 do begin
    Split('This is only a test with spaces', ' ');
  end;
  TempB := Time;
  WriteLn(InttoStr(TempB - TempA));
  for i := 1 to 10000 do begin
    SplitB('This is only a test with spaces', ' ');
  end;
  TempC := Time;
  WriteLn(InttoStr(TempC - TempB));
  for i := 1 to 10000 do begin
    Explode('This is only a test with spaces', ' ');
  end;
  WriteLn(InttoStr(Time - TempC));
end;

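I ran this a few times. Each group of three lines is the time in milliseconds for Split, SplitB, and Explode, in that order: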
35531
38797
13797

29797
24641
12687

22390
29313
9547

Then I repeated the test with a longer, messier string:
'  now      lets test          some odd    !!!! longer  splitedness 16 45485484 148 45 456 18 4 51 894 518 1 8 484 45 4 84 5 1 89 4 56484 1 4 84 1 4 4181 4  94594567 1854 8 487 4 54 48 1 8 4', ' '

29407
33859
16719

27344
29437
8547

26187
23235
11453

Then I wanted to test it with a really long string, so I added this variable:
  Str: string;
and these lines before the timing loops:
  for i := 1 to 1000 do Str := Str + Chr(Random(32, 40));
  WriteLn('--start--');
I also changed the trials from 10000 to 1000 and passed Str as the string being split.
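
Put together, the modified test would look roughly like the sketch below. It is reconstructed from the description above rather than copied from the script that was actually run. It assumes the delimiter stayed ' ' and that Split, SplitB, Explode and Time are the functions from the first code block. The `Str := ''` line is an addition for clarity and is not in the original lines.

```pascal
procedure ActivateServer();
var
  i: integer;
  TempA, TempB, TempC: cardinal;
  Str: string;
begin
  Str := ''; // explicit start value (assumption, not in the original lines)
  // Build a 1000-character string from random characters in the 32..40 range.
  for i := 1 to 1000 do Str := Str + Chr(Random(32, 40));
  WriteLn('--start--');
  TempA := Time;
  for i := 1 to 1000 do begin
    Split(Str, ' ');
  end;
  TempB := Time;
  WriteLn(InttoStr(TempB - TempA));
  for i := 1 to 1000 do begin
    SplitB(Str, ' ');
  end;
  TempC := Time;
  WriteLn(InttoStr(TempC - TempB));
  for i := 1 to 1000 do begin
    Explode(Str, ' ');
  end;
  WriteLn(InttoStr(Time - TempC));
end;
```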

116484
111688
93921

75703
84750
58875

82032
83843
44922

(
After I started the server, I started eating cheese and crackers, so I wasn't interrupting it at all.. :P
Yum, smoked hot pepper cheese on Ritz; what a mouth-watering breakfast
)

I have both Split and SplitB in there to compare the speed of returning a tstringarray versus an array of string.

I'm also not sure what "GetPiece" actually does internally, so there is probably a faster way than this Explode, although it wouldn't look as short ^^

AFAIK, Split and Explode produce the same output. I also tested them with the delimiter at the start and at the end of the string, and that seems to work.
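
A quick way to check that claim is to compare the two results element by element. This is only a sketch: SameParts and CheckSplitAgainstExplode are hypothetical helper names, not part of the script above, and it compares SplitB (rather than Split) against Explode so that both sides are an array of string.

```pascal
// Hypothetical helper: true when both arrays have the same length and items.
function SameParts(const A, B: array of string): boolean;
var
  i: integer;
begin
  Result := GetArrayLength(A) = GetArrayLength(B);
  if not Result then
    exit;
  for i := 0 to GetArrayLength(A) - 1 do
    if A[i] <> B[i] then begin
      Result := false;
      exit;
    end;
end;

// Hypothetical helper: splits Source on spaces both ways and reports the outcome.
procedure CheckSplitAgainstExplode(const Source: string);
begin
  if SameParts(SplitB(Source, ' '), Explode(Source, ' ')) then
    WriteLn('same: ''' + Source + '''')
  else
    WriteLn('DIFFERENT: ''' + Source + '''');
end;
```

Calling it with inputs such as ' one two ' (delimiter at both ends) and 'one  two' (two delimiters in a row) covers the edge cases mentioned above.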

EDIT A LONG TIME LATER:
After doing some more testing, I've made my function much faster than it was before! Here is the code, along with all my tests and numbers in it. The comment above each function lists the number of iterations, then the times I measured:
Code:
function Time: cardinal;
begin
  Result := 60000 * StrtoInt(FormatDate('nn')) + 1000 * StrtoInt(FormatDate('ss')) + StrtoInt(FormatDate('zzz'))
end;

//  2500 -  6360  6360  6375
// 10000 - 25718 26578 25750
function Explode(Source: string; const Delimiter: string): array of string; // Variable initializations removed - faster
var
  Position, DelLength, ResLength: integer;
begin
  DelLength := Length(Delimiter);
  while (true) do begin
    Position := StrPos(Delimiter, Source);
    if (Position = 0) then
      break;
    ResLength := ResLength + 1;
    SetArrayLength(Result, ResLength);
    Result[ResLength - 1] := Copy(Source, 1, Position - 1);
    Delete(Source, 1, Position + DelLength - 1);
  end;
  SetArrayLength(Result, ResLength + 1);
  Result[ResLength] := Source;
end;

//  2500 -  6641  6328  6672
function Explode_4(Source: string; const Delimiter: string): array of string; // Inc -> := - faster!
var
  Position, DelLength, ResLength: integer;
begin
  Result := [];
  ResLength := 0;
  DelLength := Length(Delimiter);
  while (true) do begin
    Position := StrPos(Delimiter, Source);
    if (Position = 0) then
      break;
    ResLength := ResLength + 1;
    SetArrayLength(Result, ResLength);
    Result[ResLength - 1] := Copy(Source, 1, Position - 1);
    Delete(Source, 1, Position + DelLength - 1);
  end;
  SetArrayLength(Result, ResLength + 1);
  Result[ResLength] := Source;
end;

//  1000 -  3000  3188  2968
//  2500 -  7531  7422  7813
function Explode_3(Source: string; const Delimiter: string): array of string; // GetArrayLength / Length -> Variables - faster!!
var
  Position, DelLength, ResLength: integer;
begin
  Result := [];
  ResLength := 0;
  DelLength := Length(Delimiter);
  while (true) do begin
    Position := StrPos(Delimiter, Source);
    if (Position = 0) then
      break;
    Inc(ResLength, 1);
    SetArrayLength(Result, ResLength);
    Result[ResLength - 1] := Copy(Source, 1, Position - 1);
    Delete(Source, 1, Position + DelLength - 1);
  end;
  SetArrayLength(Result, ResLength + 1);
  Result[ResLength] := Source;
end;

//  1000 -  4359 3828 3797
// 10000 - 41141
function Explode_2(Source: string; const Delimiter: string): array of string; // Delete -> Source := Copy - slower!!!
var
  Position: integer;
begin
  Result := [];
  while (true) do begin
    Position := StrPos(Delimiter, Source);
    if (Position = 0) then
      break;
    SetArrayLength(Result, GetArrayLength(Result) + 1);
    Result[GetArrayLength(Result) - 1] := Copy(Source, 1, Position - 1);
    //Delete(Source, 1, Position + Length(Delimiter) - 1);
    Source := Copy(Source, Position + Length(Delimiter), Length(Source));
  end;
  SetArrayLength(Result, GetArrayLength(Result) + 1);
  Result[GetArrayLength(Result) - 1] := Source;
end;

//  1000 -  3360  3500  3500
//  2500 -  9032  9828  9078
function Explode_1(Source: string; const Delimiter: string): array of string; // Remodel original
var
  Position, DelLength: integer;
begin
  Result := [];
  DelLength := Length(Delimiter);
  while (true) do begin
    Position := StrPos(Delimiter, Source);
    if (Position = 0) then
      break;
    SetArrayLength(Result, GetArrayLength(Result) + 1);
    Result[GetArrayLength(Result) - 1] := Copy(Source, 1, Position - 1);
    Delete(Source, 1, Position + Length(Delimiter) - 1);
  end;
  SetArrayLength(Result, GetArrayLength(Result) + 1);
  Result[GetArrayLength(Result) - 1] := Source;
end;

//  1000 - 10094 10203 10954
//  2500 - 24703 23594 25156
function Explode_0(Source: string; const Delimiter: string): array of string; // GetPiece original
var
  TempStr: string;
begin
  Source := Source + Delimiter;
  repeat
    TempStr := GetPiece(Source, Delimiter, 0);
    SetArrayLength(Result, GetArrayLength(Result) + 1);
    Result[GetArrayLength(Result) - 1] := TempStr;
    Delete(Source, 1, Length(TempStr) + Length(Delimiter));
  until Length(Source) = 0;
end;

procedure WriteArrayString(const AryStr: array of string);
var
  i: integer;
begin
  for i := 0 to GetArrayLength(AryStr) - 1 do
    WriteLn('[' + InttoStr(i) + '] = ''' + AryStr[i] + '''');
end;

procedure ActivateServer();
var
  i: integer;
  TempA, TempB: cardinal;
  ResultArray: array of string;
begin
//WriteArrayString(Explode('  one two three  five  ', ' '));

  TempA := Time();
  for i := 1 to 10000 do
    Explode('  now      lets test          some odd    !!!! longer  splitedness 16 45485484 148 45 456 18 4 51 894 518 1 8 484 45 4 84 5 1 89 4 56484 1 4 84 1 4 4181 4  94594567 1854 8 487 4 54 48 1 8 4', ' ');
  TempB := Time;
  WriteLn(InttoStr(TempB - TempA));

end;
I'm not doing this in a completely controlled environment, so these numbers aren't very accurate, but they should be accurate enough to give an idea of what is faster and what isn't.
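
One way to cut down on that noise is to time several batches and keep only the fastest one, since the fastest batch is the one least disturbed by whatever else the machine was doing. This is a rough sketch, not something that was part of the tests above: TimeBatch and ReportBest are hypothetical names, and it reuses the Time and Explode functions from the code above.

```pascal
// Hypothetical helper: time Count calls to Explode, in milliseconds.
// Like the Time function it relies on, the result is wrong if the batch
// crosses an hour boundary.
function TimeBatch(const Source, Delimiter: string; Count: integer): cardinal;
var
  i: integer;
  Start: cardinal;
begin
  Start := Time;
  for i := 1 to Count do
    Explode(Source, Delimiter);
  Result := Time - Start;
end;

// Hypothetical helper: run the batch Runs times and print the fastest time.
procedure ReportBest(const Source, Delimiter: string; Count, Runs: integer);
var
  r: integer;
  Elapsed, Best: cardinal;
begin
  Best := TimeBatch(Source, Delimiter, Count);
  for r := 2 to Runs do begin
    Elapsed := TimeBatch(Source, Delimiter, Count);
    if Elapsed < Best then
      Best := Elapsed;
  end;
  WriteLn('best of ' + InttoStr(Runs) + ': ' + InttoStr(Best));
end;
```

For example, ReportBest('This is only a test with spaces', ' ', 10000, 3) would replace the three separate runs with a single number.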

The current (topmost) Explode function in the code above is the best version so far.

EDIT AGAIN:
Later that day (today), CurryWurst and I worked on the function even more, ending up with an even faster version, which may be found here
« Last Edit: August 10, 2009, 02:45:50 pm by DorkeyDear »