I've confirmed data overlap in the system: a fraction of the validation data was seen during training, so the R1 scores aren't trustworthy in their current state. Even without the overlap, the early-stage model can be trained directly to R1@100% within 3 epochs, so that isn't the crucial failure point. The failure point is that the large-scale tests may have overlap, and they cannot have any for the big training run.
The overlap happened when I switched from the 200k set to the 500k set, so the full 12-million set will need a validation target other than itself. I only ran it twice, but both runs likely bled about 20k images of mixed origin from one set into the other. That's probably under 8% bleedover, but it's enough to taint the outcome.
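A quick way to quantify that kind of bleedover is to hash every file in both splits and intersect the sets. This is a minimal sketch, not the actual pipeline: the directory paths are hypothetical, and it assumes duplicates are byte-identical files (near-duplicates would need perceptual hashing instead).

```python
import hashlib
from pathlib import Path

def file_hashes(root: str) -> set[str]:
    """SHA-256 every file under a directory tree."""
    hashes = set()
    for p in Path(root).rglob("*"):
        if p.is_file():
            hashes.add(hashlib.sha256(p.read_bytes()).hexdigest())
    return hashes

# Hypothetical split locations.
train = file_hashes("data/train_500k")
val = file_hashes("data/val")

overlap = train & val
print(f"{len(overlap)} shared files ({len(overlap) / max(len(val), 1):.1%} of val)")
```

Anything above 0% here means the validation score is tainted and the split has to be rebuilt.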
I'll need another dataset to validate against, something completely removed from the attribution and completely differentiated. I'll most likely use my own dataset as validation: essentially a billion trash prompts that can't simply be solved and often make zero sense.
Even a small percentage of the validation data having been trained is enough for me to resort to extreme measures, and the damn thing reporting R1 100% all the time is annoying me anyway. I want a legitimate series of impossible combinations that cannot be represented: essentially garbage noise mixed with pure captions the model has never learned from.
The model can't easily solve these, which gives an honest measure. A million of them is probably the best possible impossible goal.
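One cheap way to manufacture that garbage-noise set is to splice words from unrelated captions together with random letter strings, so every prompt mixes real vocabulary with tokens no caption ever contained. A minimal sketch under my own assumptions; the helper name and the sample captions are illustrative, and the real run would draw from the full prompt dump:

```python
import random
import string

def garbage_prompt(captions: list[str], rng: random.Random,
                   noise_tokens: int = 4) -> str:
    """Mix one word from each of 3 unrelated captions with pure noise
    tokens, yielding a prompt no training caption can match."""
    words = [rng.choice(c.split()) for c in rng.sample(captions, k=3)]
    noise = ["".join(rng.choices(string.ascii_lowercase, k=6))
             for _ in range(noise_tokens)]
    mixed = words + noise
    rng.shuffle(mixed)
    return " ".join(mixed)

rng = random.Random(0)  # seeded so the validation set is reproducible
captions = ["a dog on a beach", "red car in snow",
            "two cats sleeping", "a plane over mountains"]
validation = [garbage_prompt(captions, rng) for _ in range(5)]
for p in validation:
    print(p)
```

Scaled up to a million draws, this gives a held-out target the model genuinely cannot have memorized, so any R1 it reports there is meaningful.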


